Receiving reports from news agencies and journalists is a core task in any news network. SINA Ingest Automation aims to automate this process. When one or more news reports are received or ingested as a single file, the system uses image processing to identify the individual news items, extracts each item together with its description, and delivers it to the news system as a video file accompanied by a description in XML format.
With SINA Ingest Automation, manual titling is no longer needed, so user errors are markedly reduced and items enter the news workflow with minimal delay. The system also applies a text processing filter to remove errors introduced by image processing.
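As an illustration only, a text processing filter of this kind might look like the sketch below. The function name `clean_ocr_text`, the `min_alpha_ratio` threshold and the heuristic itself are assumptions made for this example; the filter actually used in SINA Ingest Automation is not described here.

```python
import re


def clean_ocr_text(raw: str, min_alpha_ratio: float = 0.6) -> str:
    """Drop lines that look like OCR noise (hypothetical heuristic).

    A line is kept only if at least `min_alpha_ratio` of its characters
    are letters, digits or spaces.
    """
    kept = []
    for line in raw.splitlines():
        # Collapse runs of whitespace that OCR often inserts.
        line = re.sub(r"\s+", " ", line).strip()
        if not line:
            continue
        readable = sum(ch.isalnum() or ch == " " for ch in line)
        if readable / len(line) >= min_alpha_ratio:
            kept.append(line)
    return "\n".join(kept)
```

A stricter or looser threshold would trade missed noise against lost text, so in practice it would probably be tuned per news source.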
SINA Ingest Automation receives English-language reports from news agencies and, at the same time, reports produced by local journalists, which may be in Farsi, Arabic or English. It therefore provides a single, uniform method for receiving material from all news sources, so reports are inserted into the news workflow in the same way regardless of origin.
- Extracting text from received images
- English OCR system for extracting text from images
- Defining profiles for different news resources
- Watermarking to carry Arabic or Farsi text alongside the images
- Automatic creation of preliminary metadata for news items
- Extracting the news items contained in a file and creating a separate file for each
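To make the profile and metadata features above more concrete, here is a minimal sketch of how a preliminary XML description could be generated for one extracted item. Everything in it is assumed for illustration: the `SourceProfile` and `NewsItem` classes, their fields, and the XML element names are hypothetical and do not reflect the actual SINA schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class SourceProfile:
    # Hypothetical per-source settings; the real profile fields are not documented here.
    name: str
    language: str


@dataclass
class NewsItem:
    # One item cut out of an ingested file, with its extracted text.
    title: str
    description: str
    start_sec: float
    end_sec: float
    video_file: str


def build_metadata_xml(item: NewsItem, profile: SourceProfile) -> str:
    """Return a preliminary metadata document as an XML string."""
    root = ET.Element("newsItem")  # element names are illustrative only
    ET.SubElement(root, "source").text = profile.name
    ET.SubElement(root, "language").text = profile.language
    ET.SubElement(root, "title").text = item.title
    ET.SubElement(root, "description").text = item.description
    ET.SubElement(root, "duration").text = f"{item.end_sec - item.start_sec:.1f}"
    ET.SubElement(root, "videoFile").text = item.video_file
    return ET.tostring(root, encoding="unicode")
```

For example, calling `build_metadata_xml(NewsItem("Sample title", "Sample description", 12.5, 95.0, "item_001.mp4"), SourceProfile("Example Agency", "en"))` yields one `newsItem` document that a news system could import next to the video file.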