I attended two back-to-back sessions from the SDK track in the first round of breakouts at the 2017 ABBYY Technology Summit. All of the products covered in these sessions are developer tools for building OCR capabilities into other solutions, not end-user tools.
Semyon Sergunin, director of product marketing for ABBYY's SDK products, gave us a high-level update and a bit of the roadmap for all of the SDK products. For reference, FineReader Engine is an OCR toolkit, while FlexiCapture Engine is based on the same technology but is an SDK for document separation, classification and data extraction.
FineReader Engine 12:
- New OCR support for Farsi and Burmese languages, and improved OCR for Japanese
- Improved layout retention, so that the recognized/exported document in plain text or structured document formats (MS Office) looks more like the original
- Improved automation of document classification and data extraction using machine learning
- Additional export formats (ALTO, PDF/A-2b and PDF/A-3b), and improvements to some existing ones (XML, TXT)
He also discussed some of their licensing changes, including cloud licenses for Azure public cloud and virtual cloud instances.
FlexiCapture Engine 12:
- New classification and PDF export features supported via the API
- Update to latest version of OCR technologies
- Processing of natively digital documents (email, text, MS Word), not just images
- Cloud licensing
- Changes to classification logic depending on whether the text or image version of the content is available
- Processing of PDFs with text layers
- Linux support using a Wine wrapper
Receipts Capture SDK:
- Available on Windows, Linux (via Wine) and cloud
- Supports 120 major US vendor receipt styles
- Added field-level confidence levels, not just character or word confidence
- Added manual verification service
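Field-level confidence is what makes a manual verification step practical: rather than reviewing every receipt, only the uncertain fields need a human look. Here is a minimal sketch of that routing in Python. The `Field` class, the 0.0 to 1.0 confidence scale and the 0.9 threshold are all my own assumptions for illustration, not part of the Receipts Capture SDK.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical extracted field; not an ABBYY data structure."""
    name: str
    value: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def split_for_verification(fields, threshold=0.9):
    """Separate fields that can be accepted from those needing manual review."""
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


if __name__ == "__main__":
    receipt = [
        Field("vendor", "Example Mart", 0.98),
        Field("total", "42.17", 0.71),
        Field("date", "2017-11-02", 0.93),
    ]
    accepted, needs_review = split_for_verification(receipt)
    print("accepted:", [f.name for f in accepted])
    print("needs review:", [f.name for f in needs_review])
```

In practice the threshold would be tuned per field, since a wrong total usually costs more than a wrong vendor name.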
Mobile real-time recognition SDK:
- Built-in support for bank cards, passports, driver’s licenses from several different states, and regular expressions
- Combined SDK for video or still photo input on mobile
Cloud OCR SDK:
- Same functionality as FineReader Engine, plus a few extras such as receipt recognition
- Subscription and package pricing
There’s also a new FlexiCapture Cloud product in beta now, providing the additional functionality for document classification and data extraction.
The details here are primarily of interest to technical developers who are working with ABBYY products (or planning to), but the amount of new information shows a good rate of innovation. This was a fast high-level update, although with more detail than we saw in the analyst briefing yesterday; there will be more information coming in later breakout sessions.
This was followed by a deep dive session on the use of FineReader Engine, with Larysa Lototska, technical marketing manager, and Tony Connell, pre-sales engineer. They covered the following topics:
- Licensing, both runtime and developer
- Improving recognition accuracy by using predefined profiles for specific types of documents or data extraction, e.g., engineering drawings or business cards; and by applying additional settings via code
- Improving recognition speed by changing the engine loading method; using multiple CPU cores or concurrent recognition processes; using parallelism for multiple pages within documents; and batch scanning for batches of documents with the same number of pages (including single-page documents)
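The page-level parallelism point is easy to picture in code. This is a generic Python sketch of the pattern, not the FineReader Engine API: `recognize_page` is a hypothetical stand-in for whatever call recognizes a single page, and the worker count defaults to the number of CPU cores.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page_number: int) -> str:
    """Hypothetical stand-in for an OCR call on one page (not an ABBYY API)."""
    return f"text of page {page_number}"


def recognize_document(page_numbers, max_workers=None):
    """Recognize pages concurrently and return their text in page order."""
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order even if pages finish out of order
        return list(pool.map(recognize_page, page_numbers))


if __name__ == "__main__":
    for text in recognize_document(range(1, 5)):
        print(text)
```

I used a thread pool to keep the sketch simple. For CPU-bound recognition, separate processes (for example, `ProcessPoolExecutor`) would be the closer analogue of the concurrent recognition processes mentioned in the session.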
They gave live demos showing how to use some of the different profiles and settings in sample code in Visual Studio, applying methods for classifying and recognizing particularly difficult or degraded images.
They also discussed turning on the FineReader Engine log file to track down performance problems, since it tracks and timestamps every engine call plus any errors that are thrown, and walked through various sources of developer help on their site and bundled with the SDK.
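Since the log timestamps every call, the gaps between consecutive entries show where the time goes. The sketch below assumes a simple `date time call-name` line layout and uses made-up sample entries; the real FineReader Engine log format and call names may well differ, so the parsing would need adjusting.

```python
from datetime import datetime

# Invented sample data in an assumed format; not real engine output.
SAMPLE_LOG = """\
2017-11-02 10:15:00.000 LoadEngine
2017-11-02 10:15:04.500 OpenImage
2017-11-02 10:15:04.700 RecognizePage
2017-11-02 10:15:07.200 ExportDocument
"""


def slowest_steps(log_text, top=3):
    """Return the longest gaps between consecutive log entries, slowest first."""
    entries = []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        date, time, call = line.split(maxsplit=2)
        stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
        entries.append((stamp, call))
    # Attribute the time until the next entry to the current call
    durations = [
        ((nxt[0] - cur[0]).total_seconds(), cur[1])
        for cur, nxt in zip(entries, entries[1:])
    ]
    return sorted(durations, reverse=True)[:top]


if __name__ == "__main__":
    for seconds, call in slowest_steps(SAMPLE_LOG):
        print(f"{call}: {seconds:.1f}s")
```

The last entry has no following timestamp, so it gets no duration in this sketch.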
There are a lot of interesting sessions at the conference: even with only three tracks, I’m having trouble deciding what to attend in some time slots.