ABBYY Robotic Information Capture applies machine learning to capture

Back in the SDK track at ABBYY Technology Summit, I attended a session on “robotic information capture” with FlexiCapture Engine 12, where lead product manager Andrew Zyuzin and director of product marketing Semyon Sergunin showed some of the automated classification and data extraction capabilities powered by machine learning. Traditional enterprise capture uses manually created rules for classification and data extraction to set up for automated capture: a time-consuming training process up front in order to maximize recognition rates. At the other end of the spectrum, robotic process automation uses machine learning to analyze user actions and create classification and extraction algorithms that can be run by robots to replace human operators. In the Goldilocks middle, they position robotic information capture as a blend of these two ideas: the system is pre-trained and processes standard documents out of the box, then uses machine learning to enhance the recognition for non-standard documents by analyzing how human operators handle the exceptions. I’m not completely aligned with their use of the term robotic process automation, since RPA is not synonymous with machine learning and isn’t limited to capture applications, but I understand why they’re positioning their ML-assisted capture as robotic information capture: a middle ground between traditional capture and ML-assisted RPA.

We saw a demo of this with invoice capture: a PDF invoice was processed through their standard invoice recognition, which detected the vendor name and invoice number but picked up the wrong number for the total amount because of where the field was located. A user corrected this in the verification client, and the location of the correct total was analyzed and fed back to retrain the recognition model. The user doesn’t know that they’re actually training the system, since there’s no explicit training mode; it just happens automatically in the background for continuous improvement of the recognition rates, gradually reducing the amount of manual verification. After the training was fed back, we saw another invoice from the same vendor processed, with the invoice total field properly detected.
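
To make the feedback loop in that demo concrete, here is a toy sketch in Python. It is not ABBYY's implementation or the FlexiCapture API: the names `Candidate` and `FieldLocationMemory`, the per-vendor dictionary, the normalized page coordinates, and the "nearest candidate to the last corrected location" rule are all assumptions made up for illustration. A real system would retrain a recognition model rather than remember a single point.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A number found on the page, with a normalized (x, y) position."""
    text: str
    x: float
    y: float


@dataclass
class FieldLocationMemory:
    """Hypothetical per-vendor memory of where operators found a field."""
    corrections: dict = field(default_factory=dict)

    def record_correction(self, vendor, field_name, candidate):
        # Called in the background when an operator fixes a value during
        # verification; there is no explicit training mode.
        self.corrections[(vendor, field_name)] = (candidate.x, candidate.y)

    def pick(self, vendor, field_name, candidates, default_index=0):
        learned = self.corrections.get((vendor, field_name))
        if learned is None:
            # No feedback yet: fall back to the out-of-the-box guess.
            return candidates[default_index]
        lx, ly = learned
        # Prefer the candidate closest to the location the operator chose.
        return min(candidates, key=lambda c: (c.x - lx) ** 2 + (c.y - ly) ** 2)


memory = FieldLocationMemory()

# First invoice: the default guess picks the wrong number for the total.
first = [Candidate("120.00", 0.8, 0.5), Candidate("138.00", 0.8, 0.9)]
print(memory.pick("Acme", "total", first).text)

# The operator corrects it; the location is stored as a side effect.
memory.record_correction("Acme", "total", first[1])

# Second invoice from the same vendor: the learned location wins.
second = [Candidate("75.00", 0.8, 0.52), Candidate("86.25", 0.8, 0.88)]
print(memory.pick("Acme", "total", second).text)
```

Keying the memory by vendor mirrors what we saw in the demo, where the improvement showed up on a second invoice from the same vendor; whether the product generalizes across vendors was not covered in the session.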

Although I think that most technology is pretty interesting, this is the first thing I’ve seen today that made me say “cool!”

Zyuzin also walked us through their advanced classification, which can classify documents without any development, based on large data sets of typical document types such as invoices, cheques, and driver’s licences. Automatic classification is important as the front end to recognition so that the correct recognition techniques and templates can be applied. Their advanced classification uses both image and content classification; that is, it determines what type of document it is based on how it looks as well as the available text content. He showed us a demo of processing a package of mortgage documents, where a consumer can submit a large number of possible document types as supporting documentation; most of the documents were properly classified, but a few were unrecognized and required a quick setup of a new document type to train the classifier. This was more of a manual training process, but once the new document class was created, it could be applied to other unrecognized documents in the package.
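
As a rough illustration of combining image and content classification, the sketch below blends per-class scores from two classifiers and leaves a document unrecognized when nothing scores high enough. The session did not explain how ABBYY combines the two signals, so the weighted average, the `image_weight` and `threshold` parameters, and the `classify` function itself are assumptions of mine, not their method.

```python
def classify(image_scores, text_scores, image_weight=0.5, threshold=0.6):
    """Blend per-class scores from an image-based and a text-based classifier.

    Returns the best class name, or None when no class reaches the
    threshold, meaning the document needs a new type or manual review.
    """
    classes = set(image_scores) | set(text_scores)
    if not classes:
        return None
    combined = {
        c: image_weight * image_scores.get(c, 0.0)
        + (1 - image_weight) * text_scores.get(c, 0.0)
        for c in classes
    }
    best = max(combined, key=combined.get)
    return best if combined[best] >= threshold else None


# Looks like an invoice and reads like an invoice: classified.
print(classify({"invoice": 0.9, "cheque": 0.1}, {"invoice": 0.8, "cheque": 0.2}))

# Neither classifier is confident: left unrecognized (None).
print(classify({"invoice": 0.4, "cheque": 0.3}, {"invoice": 0.3, "cheque": 0.35}))
```

The `None` case corresponds to the unrecognized documents in the mortgage package, which is where the operator had to set up a new document type.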

3 thoughts on “ABBYY Robotic Information Capture applies machine learning to capture”

  1. Hi Sandy, yes, using ML to train document recognition is actually cool, but we at Papyrus Software have been doing that for over ten years! Using a ‘golden set’ to train the basis is not new at all, because we have been doing that since 1997. The interesting solution is to continuously train capture based on user feedback, which many refer to today as ‘deep learning’. Papyrus Capture is capable of automatically assigning document classes and variants based on the document variations. We also use ABBYY components for OCR, so I am not saying these are bad solutions, but they are most certainly not leading edge.
