Capture microservices for BPO with iCapt and ABBYY

Claudio Chaves Jr. of iCapt presented a session at the ABBYY Technology Summit on how business process outsourcing (BPO) operations are improving efficiency through service reusability. iCapt is a solutions provider for a group of Brazilian companies, including three BPOs in specific verticals, a physical document storage company, and a scanner distributor. He walked through a typical BPO capture flow (scan, recognize, classify, extract, validate, export) and how each stage can be implemented using standalone scan products, OCR SDKs, custom UIs and ECM platforms. Even though this capture process only outputs data to the customer’s business systems at the end, such a solution needs to interact with those systems throughout for data validation; in fact, the existing business systems may provide some capabilities that overlap with the capture process. iCapt decided to turn this traditional capture process around by decoupling each stage into independent, reusable microservices that can be invoked from the business systems or some other workflow capability, so that the business system is the driver for the end-to-end capture flow. The microservices can be invoked in any order, and only the ones that are required are invoked. As independent services, each can be scaled and distributed on its own, without having to scale the entire capture process.
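To make the "business system as driver" idea concrete, here is a minimal sketch, not iCapt's implementation. The service functions, the `BusinessSystem` class and the sample data are all hypothetical; in a real deployment each service would be a separate remote call rather than a local function. The point is only that the business system calls the capture services it needs and validates the results against its own records.

```python
from typing import Dict, Optional, Set

# Hypothetical stand-ins for remote capture services (assumed names and data).
def recognize_service(image_name: str) -> str:
    # Placeholder OCR result keyed by file name; illustrative data only.
    fake_ocr = {"form_17.tif": "customer: C-1001 amount: 250.00"}
    return fake_ocr.get(image_name, "")

def extract_service(text: str) -> Dict[str, str]:
    # Naive "key: value" parser standing in for real field extraction.
    tokens = text.split()
    return {tokens[i].rstrip(":"): tokens[i + 1]
            for i in range(0, len(tokens) - 1, 2)}

class BusinessSystem:
    """The business system owns the flow and its own validation data."""

    def __init__(self, known_customers: Set[str]):
        self.known_customers = known_customers

    def process_scan(self, image_name: str) -> Optional[Dict[str, str]]:
        # The driver picks which services to call, and in what order.
        text = recognize_service(image_name)
        fields = extract_service(text)
        # Validation uses the system's own records, so the capture side
        # never has to reach back into the business system.
        if fields.get("customer") not in self.known_customers:
            return None
        return fields
```

With these assumptions, `BusinessSystem({"C-1001"}).process_scan("form_17.tif")` returns the extracted fields, while a scan whose customer is not in the system's records returns `None`.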

The recognize, classify and extract steps are typically unattended, which made them immediate candidates for implementation as microservices. This allows them to be reused across processes, scaled independently, and deployed on-premises or in the cloud. For example, a capture process that handles a single type of document doesn’t require the classify service and uses only the recognize and extract services; another process that uses all three can reuse the same recognize and extract services when it encounters the document type that the first process handles, while also using the classify service to determine the document type in heterogeneous batches. iCapt is using ABBYY FineReader Engine (FRE) as a core component in their iCaptServices Cloud offering, embedded within their own web APIs that offer higher-level services on top of the FRE core functions; the entire package can be deployed as a container or serverless function to be called from other applications. They also provide services for mobile client development, so that these business applications can offer capture on mobile devices.
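The reuse described here can be sketched as two process definitions that share the same three services. This is an illustration under stated assumptions: the function names, the `SERVICES` registry and the placeholder logic are hypothetical and do not reflect the iCaptServices or FRE APIs, and the services run in-process instead of behind web endpoints.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]

# Hypothetical in-process stand-ins for the three unattended services.
def recognize(doc: Document) -> Document:
    # Placeholder OCR: the "text" is simply derived from the file name.
    return {**doc, "text": doc["image"].replace("_", " ")}

def classify(doc: Document) -> Document:
    # Placeholder classifier: a keyword match instead of a trained model.
    doc_type = "invoice" if "invoice" in doc["text"] else "receipt"
    return {**doc, "doc_type": doc_type}

def extract(doc: Document) -> Document:
    # Placeholder extraction: uses the classified type when one is present.
    return {**doc, "fields": "fields for " + doc.get("doc_type", "preset type")}

SERVICES: Dict[str, Callable[[Document], Document]] = {
    "recognize": recognize,
    "classify": classify,
    "extract": extract,
}

# Two processes reuse the same services; only one needs classification.
SINGLE_TYPE_PROCESS = ["recognize", "extract"]
MIXED_BATCH_PROCESS = ["recognize", "classify", "extract"]

def run_process(doc: Document, stages: List[str]) -> Document:
    """Invoke only the listed services, in the order given."""
    for name in stages:
        doc = SERVICES[name](doc)
    return doc
```

Running `run_process({"image": "invoice_001.tif"}, SINGLE_TYPE_PROCESS)` skips classification entirely, while the same document sent through `MIXED_BATCH_PROCESS` passes through the same recognize and extract functions with a classify step in between.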

He gave an example of a project that they did for recovering old accounting records by scanning and recognizing paper books; this was a one-time conversion project, not an ongoing BPO operation, so it was crucial that they be able to build the data capture application quickly, without developing a large amount of custom code that would have been discarded after the 10-week project. They’re currently using the Windows version of ABBYY, which increases their container/cloud costs somewhat, and are interested in trying out the Linux version that we heard about yesterday.