Back in the SDK track at ABBYY Technology Summit, I attended a session on “robotic information capture” with FlexiCapture Engine 12, with lead product manager Andrew Zyuzin and director of product marketing Semyon Sergunin showing some of the automated classification and data extraction capabilities powered by machine learning. Traditional enterprise capture uses manually created rules for classification and data extraction to set up for automated capture: a time-consuming training process up front in order to maximize recognition rates. At the other end of the spectrum, robotic process automation uses machine learning to analyze user actions and create classification and extraction algorithms that can be run by robots to replace human operators. In the Goldilocks middle, they position robotic information capture as a blend of these two ideas: the system is pre-trained and processes standard documents out of the box, then uses machine learning to enhance the recognition for non-standard documents by analyzing how human operators handle the exceptions. I’m not completely aligned with their use of the term robotic process automation, since RPA is not synonymous with machine learning and isn’t limited to capture applications, but I understand why they’re positioning their ML-assisted capture as robotic information capture: a middle ground between traditional capture and ML-assisted RPA.
We saw a demo of this with invoice capture: a PDF invoice was processed through their standard invoice recognition, which detected the vendor name and invoice number but picked up the wrong number for the total amount because of where that field was located on the page. A user corrected this in the verification client, and the information about where to find the total was analyzed for retraining and fed back to the recognition model. The user doesn’t know that they’re actually training the system, since there’s no explicit training mode; it just happens automatically in the background for continuous improvement of the recognition rates, gradually reducing the amount of manual verification. After the training was fed back, we saw another invoice from the same vendor processed, with the invoice total field properly detected.
Although I think that most technology is pretty interesting, this is the first thing I’ve seen today that made me say “cool!”
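To make the feedback loop from the invoice demo concrete, here is a minimal Python sketch of the idea. It is not ABBYY’s implementation or the FlexiCapture API: the `FieldLocator` class, the region names, and the per-vendor counting scheme are all hypothetical, and a real engine would learn from far richer layout features than a region label.

```python
from collections import Counter, defaultdict


class FieldLocator:
    """Toy stand-in for a recognition model (hypothetical, not ABBYY's API).

    It starts with a default page region per field and remembers, per vendor,
    the regions that operators confirmed when they fixed a wrong value.
    """

    def __init__(self, default_regions):
        self.default_regions = dict(default_regions)
        # (vendor, field) -> Counter of regions confirmed by operators
        self.corrections = defaultdict(Counter)

    def predict(self, vendor, field):
        """Return the region where the field is expected for this vendor."""
        seen = self.corrections.get((vendor, field))
        if seen:
            # Prefer the region operators have confirmed most often.
            return seen.most_common(1)[0][0]
        return self.default_regions[field]

    def record_verification(self, vendor, field, predicted, confirmed):
        """Called by the verification step; there is no separate training mode."""
        if confirmed != predicted:
            self.corrections[(vendor, field)][confirmed] += 1


locator = FieldLocator({"total": "bottom_right"})

# First invoice: the default location is wrong for this vendor,
# and the operator points at the right place during verification.
guess = locator.predict("Acme", "total")
locator.record_verification("Acme", "total", predicted=guess, confirmed="mid_right")

# Next invoice from the same vendor: the corrected location is used.
print(locator.predict("Acme", "total"))
```

The point of the sketch is that learning is a side effect of ordinary verification work: the operator only fixes a value, and the correction is what updates the model for that vendor.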
Zyuzin also walked us through their advanced classification, which can classify documents without any development based on large data sets of typical document types such as invoices, cheques, and driver’s licences; automatic classification is important as the front end to recognition so that the correct recognition techniques and templates can be applied. Their advanced classification uses both image and content classification; that is, it determines what type of document it is based on how it looks as well as the available text content. He showed us a demo of processing a package of mortgage documents, where there is a large number of possible documents that can be submitted by a consumer as supporting documentation; most of the documents were properly classified, but a few were unrecognized and required a quick setup of a new document type to train the classifier. This was more of a manual training process, but once the new document class was created, it could be applied to other unrecognized documents in the package.
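As a rough illustration of combining image and content classification, with unrecognized documents left for manual setup, here is a small Python sketch. Everything in it is an assumption made for illustration: the keyword sets, the layout feature names, the equal weighting, and the 0.5 threshold are invented, and the product’s actual classifiers are trained models rather than set overlaps.

```python
def overlap_scores(observed, profiles):
    """Fraction of each profile's (non-empty) feature set found in the observed set."""
    return {label: len(observed & feats) / len(feats) for label, feats in profiles.items()}


def classify(text, layout, keywords, layouts, threshold=0.5, image_weight=0.5):
    """Blend content and image scores; fall back to 'unrecognized' below the threshold.

    `keywords` and `layouts` are hypothetical per-type profiles with the same labels.
    """
    content = overlap_scores(set(text.lower().split()), keywords)
    image = overlap_scores(set(layout), layouts)
    combined = {
        label: image_weight * image[label] + (1 - image_weight) * content[label]
        for label in keywords
    }
    best = max(combined, key=combined.get)
    return best if combined[best] >= threshold else "unrecognized"


KEYWORDS = {"invoice": {"invoice", "total", "due"}, "cheque": {"pay", "order", "of"}}
LAYOUTS = {"invoice": {"table", "logo"}, "cheque": {"micr_line", "signature_box"}}

doc_text = "Notice of assessment for tax year"
doc_layout = {"table"}
print(classify(doc_text, doc_layout, KEYWORDS, LAYOUTS))

# Quick setup of a new document type, then re-run on the unrecognized document.
KEYWORDS["tax_assessment"] = {"notice", "assessment", "tax"}
LAYOUTS["tax_assessment"] = {"table"}
print(classify(doc_text, doc_layout, KEYWORDS, LAYOUTS))
```

Once the new type is defined, the same profiles apply to any other unrecognized documents in the package, which mirrors the manual step in the mortgage demo.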