ABBYY SDK update and FineReader Engine deep dive

I attended two back-to-back sessions from the SDK track in the first round of breakouts at the 2017 ABBYY Technology Summit. All of the products covered in these sessions are developer tools for building OCR capabilities into other solutions, not end-user tools.

Semyon Sergunin, director of product marketing for ABBYY’s SDK products, gave us a high-level update and a bit of the roadmap for all of the SDK products. For reference, FineReader Engine is an OCR toolkit, while FlexiCapture Engine is based on the same technology but is an SDK for document separation, classification and data extraction.

FineReader Engine 12:

  • New OCR support for Farsi and Burmese languages, and improved OCR for Japanese
  • Improved layout retention, so that the recognized/exported document in plain text or structured document formats (MS Office) looks more like the original
  • Improved automation of document classification and data extraction using machine learning
  • Additional export formats (ALTO, PDF/A 2-b and 3-b), and improvements to some existing ones (XML, TXT)

He also discussed some of their licensing changes, including cloud licenses for Azure public cloud and virtual cloud instances.

FlexiCapture Engine 12:

  • New classification and PDF export features supported via the API
  • Update to latest version of OCR technologies
  • Processing of natively-digital documents (email, text, MS-Word), not just images
  • Cloud licensing
  • Changes to classification logic depending on whether the text or image version of the content is available
  • Processing of PDFs with text layers
  • Linux support using a Wine wrapper

Receipts Capture SDK:

  • Available on Windows, Linux (via Wine) and cloud
  • Supports 120 major US vendor receipt styles
  • Added field-level confidence levels, not just character or word confidence
  • Added manual verification service

Mobile real-time recognition SDK:

  • Built-in support for bank cards, passports, several different states’ drivers licenses, and regular expressions
  • Combined SDK for video or still photo input on mobile

Cloud OCR SDK:

  • Same functionality as FineReader Engine, plus a few extras such as receipt recognition
  • Subscription and package pricing

There’s also a new FlexiCapture Cloud product in beta now, providing the additional functionality for document classification and data extraction.

The details here are primarily of interest to technical developers who are working with ABBYY products (or planning to), but the amount of new information shows a good rate of innovation. This was a fast high-level update, although with more detail than we saw in the analyst briefing yesterday; there will be more information coming in later breakout sessions.

This was followed by a deep dive session on the use of FineReader Engine, with Larysa Lototska, technical marketing manager, and Tony Connell, pre-sales engineer. They covered the following topics:

  • Licensing, both runtime and developer
  • Improving recognition accuracy by using predefined profiles for specific types of documents or data extraction, e.g., engineering drawings or business cards; and by applying additional settings via code
  • Improving recognition speed by changing the engine loading method; using multiple CPU cores or concurrent recognition processes; using parallelism for multiple pages within documents; and batch scanning for batches of documents with the same number of pages (including single-page documents)
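To make the page-level parallelism idea a bit more concrete, here is a minimal Python sketch of recognizing the pages of one document concurrently. This is not ABBYY sample code: recognize_page is a hypothetical stand-in for whatever single-page OCR call you are using, and I've used a thread pool only to keep the example self-contained. A real deployment would more likely rely on separate processes or the engine's own facilities for using multiple cores.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_page(page_image: str) -> str:
    """Hypothetical stand-in for a single-page OCR call; not an ABBYY API."""
    return f"text of {page_image}"


def recognize_document(pages: list[str], workers: int = 4) -> list[str]:
    """Recognize the pages of one document concurrently, preserving page order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even if pages finish out of order
        return list(pool.map(recognize_page, pages))
```

Calling recognize_document(["page1.tif", "page2.tif"]) returns one result per page in the original page order, which is the property that matters when reassembling a multi-page document.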

They gave live demos showing how to use some of the different profiles and settings in sample code in Visual Studio, applying methods for classifying and recognizing particularly difficult or degraded images.

They also discussed turning on the FineReader Engine log file to track down performance problems, since it tracks and timestamps every engine call plus any errors that are thrown, and walked through various sources of developer help on their site and bundled with the SDK.

There are a lot of interesting sessions at the conference: even with only three tracks, I’m having trouble deciding what to attend in some time slots.

The collision of capture, content and analytics

Martyn Christian of UNDRSTND Group, who I worked with back at FileNet in 2000-1, gave a keynote at ABBYY Technology Summit 2017 on the evolution and ultimate collision of capture, content and analytics. He started by highlighting some key acquisitions in the industry, including the entry of private capital, as well as a move to artificial intelligence in the capture space, as harbingers of the changes in the capture market. Meanwhile, Gartner declared enterprise content management dead (long live content services platforms!) and introduced new players in the magic quadrant alongside the traditional ECM players, while shifting IBM from the leaders quadrant back to the challengers quadrant.

Intelligent capture is gaining visibility and importance, particularly as a driver for digital transformation. Interestingly, capture was traditionally about converting analog (paper) to digital (data); now, however, many forms of information are natively digital, and capture is not only about performing OCR on scanned paper documents but about extracting and analyzing actionable data from both analog and digital content. High-volume in-house production scanning operations are being augmented — or replaced — with customers doing their own capture, such as we now see with depositing a check using a mobile banking application. Information about customer actions and sentiment is being automatically gleaned from their social media actions. Advanced machine learning is being used to classify content, reducing the need for manual intervention further downstream, and enabling straight-through processing or the use of autonomous agents.

As a marketing guy, he had a lot of advice on how this can be positioned and sold into customers; UNDRSTND apparently ran a workshop yesterday for some of the channel partner companies on bringing this message to their customers who are seeking to move beyond simple capture solutions to digital transformation.

ABBYY corporate vision and strategy

We have a pretty full agenda for the next two days of the 2017 ABBYY Technology Summit, and we started off with an address from Ulf Persson, ABBYY’s relatively new worldwide CEO (although he is a long-time board member). ABBYY has its roots in Russia and has spread from country to country in a somewhat disjointed way in the past, so Persson is pushing the idea of #OneABBYY: a global company rather than a collection of regional companies. A leader in OCR since 1989, ABBYY is the best-kept secret in OCR: end customers don’t know who they are, and even other vendors in the same space haven’t heard the name, unlike that of their competitors. They are actively trying to change how they grow: becoming more globally balanced in terms of development, marketing and organizational structure; becoming more agile and customer-centric; and continuing to be profitable and innovative. Their revenues are well-diversified, with 33% in Europe, 28% in North America, 19% in global accounts (bundling with hardware vendors), then smaller segments in Russia, Africa and Australia.

Their strategy includes:

  • Increasing market share in enterprise capture by pushing intelligent capture solutions, primarily as a cloud service.
  • Becoming the partner of choice for ISVs that need to build capture capabilities into their solutions. Unlike some other capture vendors, they are not looking to push into adjacent spaces, such as BPM, but plan to stay as an independent vendor in the intelligent capture and automation market that can partner with a wide variety of hardware, software and solution providers.
  • Becoming a leader in text analytics solutions, driven by the data that they capture from documents. He mentioned contracts in particular, where very complex text analytics are required to automate understanding of these types of documents.

They are making use of machine learning and artificial intelligence in their capture technology, and offering real-time recognition as a service or as embedded technology.

As a self-funded profitable company, they don’t need to go to the markets for funding, and state outright that they are not for sale.

Disclaimer: ABBYY has been my customer in the past — I gave the keynote here last year as well as another presentation in Toronto, and wrote a white paper — but I am not being compensated for my time here this week or for writing these posts. ABBYY did pay for my flights and hotel, which is the usual deal that I have with vendors to attend their conferences and blog my thoughts about what I see.

ABBYY analyst briefing

I’m in San Diego for a quick visit to the ABBYY Technology Summit. I’m not speaking this year (I keynoted last year), but wanted to take a look at some of the advances that they’re making in intelligent capture. We had an analyst briefing today in advance of the general conference tomorrow, and some of this is a preview for those more detailed sessions.

ABBYY’s legacy is in the OCR SDK business, which allows their partners to build solutions that include intelligent data capture from scanned documents. They’re moving beyond that with mobile capture products and cloud capture solutions. They have a lot of flexibility with mobile and cloud, allowing for both hybrid solutions that use mobile for capture with recognition on a more powerful cloud platform, and for mobile-only data extraction that operates completely on the mobile platform. Since there are components for this in addition to packaged solutions, a developer can create a mobile application that makes decisions about what type of recognition to do, and where to do it. This uses a real-time recognition SDK that uses video feed to do self-correction based on several frames of video rather than just a single snap, or simpler recognition based on still photos.
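As a rough illustration of why several frames can beat a single snap, here is a small Python sketch that takes the text read from each video frame and does a majority vote on each character position. This is purely my own simplification, not ABBYY's algorithm: the function name merge_frames and the assumption that readings of the same length line up character by character are both mine.

```python
from collections import Counter


def merge_frames(readings: list[str]) -> str:
    """Majority-vote each character position across the per-frame readings."""
    if not readings:
        return ""
    # Keep only readings of the most common length so that positions line up
    common_len = Counter(len(r) for r in readings).most_common(1)[0][0]
    aligned = [r for r in readings if len(r) == common_len]
    # For each position, pick the character seen in the most frames
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*aligned)
    )
```

With four frames of a card number where one frame misreads a zero as the letter O and another misreads a 5 as a 6, the vote recovers the correct digits because each error is outnumbered at its position.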

Their cloud OCR service supports a community of more than 65,000 developers with 69,000 connected applications: a great use of distributed microservices from applications that need OCR but don’t want to own that technology. Assuming that privacy issues can be satisfied (since you’re sending them potentially private documents), many organizations could benefit from OCR but may not be able to afford to own a high-performance solution in-house.

They also have some packaged solutions for receipt capture and identity documents (e.g., passports). They also have a Linux version of their OCR services (their primary products are Windows and Azure-based), which is popular in certain markets.

They covered some of the market trends in capture:

  • Less of a discrete technology and more of an embedded capability in business applications
  • No longer production-line capture, but more intelligent capture of heterogeneous documents; there’s a great deal more diversity in document type and point of origin
  • Organizations are using this to automate where possible, particularly for front-end work such as capture

This requires core capabilities of processing large volumes of documents where the content may be diverse and frequently changing, and the seamless interaction of content coming from a variety of input streams with a larger business solution. Back in the earlier days of “imaging and workflow”, we had dedicated scanning and recognition workstations for high-volume assembly line processing of documents that were all the same, or manually classified. Now, we need to have this happening on any computer, or on a mobile device, at any point in a process.

ABBYY’s products are aiming to address these changing market conditions with autonomous classification and train-by-example data extraction, including some clever processing of related documents: if you have two documents from someone with the same piece of information (e.g., an SSN), and the confidence level is low on the recognition of that information on one of the documents but high on the other, the stronger confidence level can be used to boost the confidence of the lower level. They also have improved integration capabilities, including being able to embed the capture capabilities as an iframe in an HTML page, and an increased number of input channel types. They’re working with some of the low-code vendors since it’s now pretty straightforward to map the outputs from an ABBYY service to the inputs of a data-centric low-code application.
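The cross-document confidence boost can be sketched in a few lines of Python. The briefing didn't go into how ABBYY actually calculates this, so treat the following as an assumption-laden illustration: the FieldReading class, the 0.9 threshold, and the rule of "take the highest confidence among trusted readings with the same value" are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class FieldReading:
    document: str
    value: str
    confidence: float  # 0.0 to 1.0


def cross_check(readings: list[FieldReading], threshold: float = 0.9) -> list[FieldReading]:
    """Raise the confidence of readings that agree with a trusted reading."""
    # A reading is "trusted" if its own confidence meets the threshold
    trusted = [r for r in readings if r.confidence >= threshold]
    result = []
    for r in readings:
        # Confidences of trusted readings that have exactly the same value
        matches = [t.confidence for t in trusted if t.value == r.value]
        boosted = max([r.confidence, *matches])
        result.append(FieldReading(r.document, r.value, boosted))
    return result
```

A low-confidence SSN on one document is boosted only when a high-confidence reading elsewhere has the identical value; a reading that differs by even one digit keeps its original low score and would still go to manual verification.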

Their core customer base is still banking (at least in North America), but they are starting to see growth in insurance and other markets. Their number one use case is invoice processing, and they have a packaged application to address that sector. The mid market is underserved in terms of automating invoice processing; it’s a tough problem since inbound invoices can be in any sort of format, and there’s quite a bit of intelligence required to ensure that all of the data is extracted correctly. Note that a lot of larger enterprises either have EDI-type processes for invoices, or force smaller vendors to log in to a dedicated invoicing portal to submit an invoice in the enterprise’s format rather than the vendor’s usual format, and are more likely to have automated capture processes for the remainder. ABBYY’s goal is to complement existing accounts payable solutions by being built in as the front-end capture component, not to displace these systems.

This briefing drives home that ABBYY is the best-kept secret in intelligent capture because they work primarily through partners who bundle their capture technology into vertical solutions, but don’t have as much visibility to the end customers. Most of the enterprise customers that I talk to have never heard of ABBYY, although they may have it running in their organizations embedded within another application. Even other vendors in this space, such as BPM and low-code vendors, don’t know the name. This is a bit different from ABBYY’s competitors Kofax and Captiva, who both have had a lot of end-customer solutions that move beyond capture in addition to capture-related SDKs, or IBM’s Datacap, which does some of that but also comes in on the coattails of a suite of IBM products such as ECM and BPM. Whether ABBYY can change this market visibility themselves — or if they even want to — will be an interesting positioning going forward.

More on all of this tomorrow and Friday.

Low code and case management discussion with @TIBCO

I’m speaking on a webinar sponsored by TIBCO on November 9th, along with Roger King (TIBCO’s senior director of product management and strategy, and Austin Powers impressionist extraordinaire) and Nicolas Marzon (TIBCO’s director of strategic enablement group). From their registration page:

Supercharge your digital transformation – When low code meets case management

While digital transformation is likely on your company’s agenda, the demand for ever more enterprise apps is not slowing down. How can you both transform and meet this development need?

Process-centric applications that run your business involve content, events, decisions, and automation. Knowledge workers benefit from environments that integrate all of these capabilities in a case management paradigm, which combines automation with human reasoning. And new low-code development platforms will let your business users configure their own case management apps to meet their situational or strategic needs.

This is not a structured presentation or TIBCO demo: instead, I’ll kick off with a couple of level-setting slides on case management and low code platforms, then lead a discussion with Roger and Nicolas on a variety of issues facing us with low code and case management. Some of the things on my list of potential topics:

  • What are the business and technology drivers pushing us towards low code?
  • How do we reconcile citizen developers’ situational applications with a broader architecture and design vision?
  • How are low code platforms and their developers supported by a center of excellence without squashing innovation?
  • How do we handle governance of low code apps to make sure that they don’t do anything that might negatively impact privacy or performance?
  • What sort of organizational links do we need to bring together microservices developers and low code platforms?

If you have some other topics that you’d like to hear us discuss, please add them as comments below and I’ll try to work them in. Sign up for the webinar at the registration link above and join in on November 9th.

I’m also working on a couple of white papers for them on case management and low code, which is coming up in pretty much every business and technical discussion that I have these days. At least one of those papers will be available by the time of the webinar, with the other available shortly after.

Financial decisions in DMN with @JanPurchase

Trisotech and their partner Lux Magi held a webinar today on the role of decision modeling and management in financial services firms. Jan Purchase of Lux Magi, co-author (with James Taylor) of Real-World Decision Modeling with DMN, gave us a look at why decision management is important for financial services. One of the key places for applying decision management is in compliance, which is all about decision-making: assessing risks, applying regulations, sharing data, and ensuring that rules are applied in a uniform manner. There are a lot of other areas where decision management can be applied, and potentially automated where there is a high volume/speed of transactions with a non-zero cost of errors. Decision management lets you make decisions explicit: it separates them from other business software to increase transparency and agility, and makes it easier for business people to understand what decisions are being applied and how that links to overall business goals. In particular, if decisions are automated with a decision management system, business people can quickly make changes to decision-making when compliance regulations change, with much less IT involvement than would be required to modify legacy business systems.

There is a great deal of value in modeling decisions even if they are embedded within business systems and won’t be automated using a decision management system: decision models provide a way for business people to specify how systems should behave based on business data. Luckily, there is now a standard for decision modeling: Decision Model and Notation (DMN). This notation allows a decision to be modeled as a Decision Requirements Diagram (DRD) of the sub-decisions and knowledge sources that are required to reach that decision, and the possible paths to take in order to reach the decision. Within each of the decision nodes in the DRD, a definition of the decision can be specified using a decision table or the Friendly Enough Expression Language (FEEL), which may then be linked to an automated decision management system.
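For readers who haven't seen a decision table, here is a minimal sketch of the idea in plain Python rather than FEEL or an actual DMN engine. The rules, input names and outcomes are hypothetical (they are not from the webinar), and the evaluation follows what DMN calls the First hit policy: rows are checked in order and the first match wins.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    condition: Callable[[dict], bool]
    outcome: str


# Hypothetical rows for an illustrative "transaction review" decision
RULES = [
    Rule(lambda d: d["amount"] >= 10_000 and d["country_risk"] == "high", "block"),
    Rule(lambda d: d["amount"] >= 10_000, "manual review"),
    Rule(lambda d: d["country_risk"] == "high", "manual review"),
    Rule(lambda d: True, "approve"),  # catch-all row
]


def decide(inputs: dict) -> str:
    """Return the outcome of the first row whose condition matches."""
    for rule in RULES:
        if rule.condition(inputs):
            return rule.outcome
    raise ValueError("no rule matched")
```

The point of keeping the rows as data, separate from the code that evaluates them, is the same one made above: when a regulation changes, you edit a row rather than digging through application logic.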

We then saw what a decision model looks like in Trisotech’s DMN Modeler, which allows for a standard DRD to be created, then augmented with additional information such as decision makers and owners. Purchase walked us through a number of the features of DMN as well as specific features of Trisotech’s tool, including analysis of decisions relative to Bruce Silver’s Method and Style best practices, and decision animation.

If you know a bit about DMN already but want to understand some of the practical aspects of working with it in financial services, I assume that a replay of the webinar will be available at the original registration link or the Lux Magi event page.

ABBYY Technology Summit 2017

Welcome back after a nice long summer break!

Last year, I gave the keynote at ABBYY’s Technology Summit, and I’m headed back to San Diego this year to just do the analyst stuff: attend briefings and hang out at the conference sessions. This will give me a chance to do more writing than when I’m presenting; last year, I only had time to blog about one session at ABBYY’s conference.

You can find out more about the summit and register to attend here, and more about ABBYY and their intelligent capture products here. Looks like some interesting sessions, including those on improving capture and recognition with machine learning.

Readers may have noticed that I’ve severely curtailed my conference travel in the past year or two. Large conferences just aren’t good value for my time, especially for vendors that have a wide variety of products outside my area of interest:  when I attend a conference, I’m giving up billable time to be there so need to gain useful information or make valuable contacts to make it worth my while. I will almost always attend a client’s conference even if I’m not speaking, or a smaller conference that looks interesting, or one in an interesting location. As described on my Legal page, a vendor must cover my travel expenses to have me attend their conference but doesn’t provide any other remuneration unless I’m also giving a presentation at the conference.

Insurance case management: SoluSoft and OpenText

It’s the last session of the last morning at OpenText Enterprise World 2017 — so might be my last post from here if I skip out on the one session that I have bookmarked for late this afternoon — and I’m sitting in on Mike Kremer of OpenText and Kiran Thakrar of SoluSoft showing SoluSoft’s Active Client Management for Insurance, built on OpenText’s Process Suite and case management. SoluSoft originally built this capability on earlier OpenText products (Global 360) but have moved to the new low-code platform. Their app can be used out of the box, or can be configured to suit a particular environment.

The goal of Active Client Management for Insurance is to provide a 360 view of the client, including data from a variety of sources (typically systems of record for policy administration or claims), content from any repository, open tasks and pending actions, checklists and ad hoc notes. It includes the entire customer lifecycle, from onboarding and underwriting, through policy administration and claims; basically, it’s user work management and CRM in one.

The solution is built on the core of Process Suite, using the full entity modeling AppWorks-style low-code development. It also includes process intelligence for analytics, Capture Center for document capture, and Streamserve for customer communication management. Above all of these OpenText building blocks, SoluSoft has built a client management solution accelerator that (I believe) they can use for a variety of vertical applications; below the OpenText layer is a service bus integration to line of business systems. For insurance, they’ve created a number of business processes and request types corresponding to different parts of the business, such as processing a new application, amending a policy, or initiating a claim; each of these can be configured for the specific customer’s terminology, or disabled if they don’t require specific functions. It’s not completely clear, however, how much of the functionality of other insurance systems might be replaced by this rather than augmented: clearly, the core policy administration system stays as the system of record, but an underwriting or claims system workflow might be replaced by this functionality. Having done this a few times with clients that use systems such as Guidewire, I have to say that this is a non-trivial architectural exercise to decide what parts of the flow happen where, and how to properly interact with other systems.

At the heart is a generic capture-driven workflow: scan, capture, index, data entry, process, approve, review, fulfill. The names of these can be aliased for different vertical applications — their example is that “processing” becomes “underwriting” — and steps can be skipped for a specific request type. Actions that can be performed at any of these work steps are configured using checklists. Ad hoc processes can be attached to steps in this master flow, either a single-step task or a more complex flow, and be executed before, after or in parallel to the pre-defined work step. Ad hoc processes can be created at runtime, and secondary request processes created for certain case types. The ability to make any of these configuration changes is restricted by role security. Relationships between clients, policies, brokers, claims, etc. are managed using folders for customers, policies and advisers, driven by entity modeling in Process Suite (AppWorks Low Code); this ability to establish relationships between all of these types of entities is critical for providing the complete view of the customer. They also have integrated iHub analytics for showing case statistics and workload analysis, as well as more complex analysis of risk or profitability for specific customer groups or policy types.
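The aliasing and step-skipping in that master flow is easy to picture as configuration over a fixed list of steps. The Python sketch below is only my illustration of the concept, not how SoluSoft or Process Suite implement it; the step names come from the session, while build_flow and its parameters are made up.

```python
MASTER_FLOW = ["scan", "capture", "index", "data entry",
               "process", "approve", "review", "fulfill"]


def build_flow(aliases: dict[str, str], skipped: set[str]) -> list[str]:
    """Return the display names of the steps for one request type, in order."""
    # Skipped steps are dropped; remaining steps use their alias if one exists
    return [aliases.get(step, step) for step in MASTER_FLOW if step not in skipped]
```

For example, a request type that arrives electronically might skip "scan" and "capture" and alias "process" to "underwriting", giving index, data entry, underwriting, approve, review, fulfill.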

Although SoluSoft built some of this in custom code, a lot of the application is built directly in the OpenText low code development environment provided by Process Suite. This means that it’s fast to configure or even do some basic customizations, with the caveats that I mentioned earlier about deciding on where some parts of the workflow might happen when you have existing LOB systems that already do that. It also provides them with native mobile support through AppWorks, rather than having to build a separate mobile application.

We saw the version focused on insurance, but they also have flavors for pensions, financial services, government, healthcare and education. However, it appears that there is an existing legacy of the Global 360-based application, and it’s not clear how long it will take for this new AppWorks version to make its way into the market.

Getting started with OpenText case management

I had a demo from Simon English at the OpenText Enterprise World expo earlier this week, and now he and Kelli Smith are giving a session on their dynamic case management offering. English started by defining case management:

  • Management of dynamic, unstructured processes
  • Processes are driven by events or human interactions to support faster, more accurate decisions
  • Decisions are tied to content and the case directs that content to the right conclusion

In their terms, a case is a transaction that is “opened” and “closed” over a period of time: resolve a problem, settle a claim, or fulfill a request. There may be many different types of participants required to complete the case, and a variety of content and data involved.

Similar to the approach of other vendors, OpenText equates “case management” with “vertical application development” to a certain extent, and getting to case handling quickly needs a blueprint to quick-start solution development. To that end, they provide an accelerator as part of Process Suite that includes a pre-defined case model and entities to provide a starting point for developing a case management application, particularly incident management or service requests. Essentially, it’s a sample app/template, albeit a well-structured one that can easily be modified for actual solutions; they have no illusions that this is going to be an out-of-the-box solution for anyone, but rather a guide for people creating new case management applications so that they don’t need to start from scratch.

If you refer back to the more complete description of AppWorks Low Code that I gave in the previous post, they have defined entities, forms, layouts and a case lifecycle that fit a wide variety of request-style case management applications.

Smith then gave us a demonstration of People Center — similar to what we saw her do on the main stage on Tuesday — and discussed how they used the case management accelerator as a starting point for developing the People Center application. They used some parts of the template pretty much as is — such as the request creation form — but made it specific to HR management and extended the capabilities to suit, including a dashboard specific to each role. Checklists and options are specific to the HR application, but as discussed in previous posts, those will persist through an upgrade of the underlying People Center application.

She also walked us through the case management accelerator in the development environment, showing the fairly complete set of entities, forms, layouts, action bars, lists, relationships, rules, email templates, BPM processes, roles and other objects, as well as how easy it is to modify them for your own use. For any partners in the audience, or even customer developers, this will resonate as a method of quickly creating a fully-customized application based on the template that addresses a specific vertical functionality.

OpenText Process Suite becomes AppWorks Low Code

“What was formerly known as Process Suite is AppWorks Low Code, since it has always been an application development environment and we don’t want the focus to be on a single technology within it.” – Dana Khoyi, architect of OpenText’s Process Suite

That pretty much sums up the biggest BPM positioning/branding announcement at OpenText Enterprise World 2017 this week. BPM is dead, long live low-code application development? Note that AppWorks is the name used for all OpenText developer tools: the technical developer APIs and access points, plus this low code product, which is really a separate product.

Khoyi and Kelli Smith (who did the main stage People Center demo on Tuesday) led a session on the last day of Enterprise World to show how Process Suite AppWorks is used to create applications, starting with defining composite entities (business objects made up of multiple pieces of data), then UI constructs including forms, dashboards and lists. Because process and content are built into the environment, there are easy building blocks for content lifecycle, activity flow and history. Declarative rules (triggered on conditions, events or user actions) are supported, as is dropping out to a full process model for more complex flows and events. They also have a development framework for building customizable applications that persists customizations separately from the application and merges them at runtime, allowing a new version of the core application to be installed without discarding the previous customizations, although obviously you’d want to test and might need to make some minor retrofits.
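The "persist customizations separately and merge at runtime" pattern is worth a quick illustration, since it's what makes upgrades survivable. The sketch below is generic Python with made-up application definitions; I have no visibility into how OpenText stores or merges customizations, so this shows only the general overlay technique.

```python
def merge(base: dict, overlay: dict) -> dict:
    """Recursively apply overlay on top of base without mutating either."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested definitions: merge them key by key
            merged[key] = merge(merged[key], value)
        else:
            # Otherwise the customization simply wins
            merged[key] = value
    return merged


# Hypothetical core application definitions (two versions) and a customization
core_v1 = {"form": {"title": "Request", "fields": ["name", "date"]}}
core_v2 = {"form": {"title": "Request", "fields": ["name", "date", "priority"]},
           "dashboard": {"enabled": True}}
customization = {"form": {"title": "HR Request"}}
```

Because the customization is stored on its own, merge(core_v2, customization) picks up the new priority field and dashboard from the upgraded core while keeping the customized title. The retrofit risk arises when a new core version renames or removes something that the customization refers to.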

Application development starts by defining the core entity for the application (think process or case instance class), then adding properties (data fields) and building blocks: forms to edit and display those properties (as well as built-in properties such as state); lists that can be worklists or reporting artifacts; and layouts, which are essentially the application UI screens and can include the previously-created forms plus actions, breadcrumbs, and related content. Data/content security and access/update conflicts are handled automatically on the forms/layouts based on underlying security definitions. Apps that are created can be published immediately to run; these can be moved as packages between testing and production environments, although it’s not clear that there’s any versioning or automation around that, so likely some manual governance is required.

Other building blocks that can be added to an application include:

  • A history log that maintains a complete audit trail of everything done during the instance including field-level data changes
  • A discussion for collaborative chat/comments on an instance
  • Content, which can be files/folders that are attached to the case instance using a local document store or other content store via a connector or CMIS, or a business workspace within Content Server (using Extended ECM) which stores the content in CS and allows access from either environment while syncing properties between them
  • Email templates that provide a form letter email capability for inbound/outbound email associated with the case
  • Three ways of managing work:
    • Lifecycle, which is a state machine-oriented view (i.e., milestones and the actions required to move between states) for a simple case workflow
    • BPM, for a full drop to the BPMN editor for complex process flows
    • Action flow, which is a simple sequence flow
  • Mobile app creation
  • Entity relationships
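Of the three ways of managing work listed above, the lifecycle is the easiest to sketch: it's a state machine where each action is only valid in certain states. The Python below is a generic illustration with hypothetical states and actions, not anything generated by or specific to AppWorks.

```python
# Hypothetical states and actions for a simple request lifecycle
TRANSITIONS = {
    ("new", "submit"): "in review",
    ("in review", "approve"): "approved",
    ("in review", "reject"): "rejected",
    ("approved", "fulfill"): "closed",
}


class Case:
    def __init__(self) -> None:
        self.state = "new"
        self.history = [self.state]  # simple audit trail of states visited

    def apply(self, action: str) -> None:
        """Move to the next state, or raise if the action is not allowed here."""
        try:
            self.state = TRANSITIONS[(self.state, action)]
        except KeyError:
            raise ValueError(
                f"'{action}' is not allowed in state '{self.state}'"
            ) from None
        self.history.append(self.state)
```

A case that is submitted and then approved ends up in the "approved" state with a three-entry history; trying to reject it at that point raises an error, which is the sort of guardrail that a lifecycle gives you without a full BPMN model.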

There’s a lot of stuff in here, and we didn’t see it all in this short session, but it looks like a pretty robust environment for low-code development. Khoyi stated explicitly that this is becoming the development environment for all OpenText products, replacing the workflow capabilities in Content Server and Documentum.