Vega Unity 7: productizing ECM/BPM systems integration for better user experience and legacy modernization

I recently had the chance to catch up with some of my former FileNet colleagues, David Lewis and Brian Gour, who are now at Vega Solutions and walked me through their Unity 7 product release. Having founded and run a boutique ECM and BPM services firm in the past, I have a soft spot for small companies that add value to commercial products by building integration layers and vertical solutions to do the things that those products don’t do (or don’t do very well).

Vega focuses on enterprise content and process automation, primarily for financial and government clients. They have some international offices – likely development shops, based on the locations – and about 150 consultants working on customer projects. They are partners with both IBM and Alfresco for ECM and BPM products for use in their consulting engagements. Like many boutique services firms, Vega has developed products in the course of their consulting engagements that can be used independently by customers, built on the underlying partner technology plus their own integration software:

  • Vega Interchange, which takes one of their core competencies in content migration and creates an ETL platform for moving content and processes between any of a number of systems, including Documentum, Alfresco, OpenText, four flavors of IBM, and shared folders on file systems. Content migration is typically pretty complex by the time you consider metadata and permissions mappings, but they also handle case data and process instances, which is rarely tackled in migration scenarios (most just recommend that you keep the old system alive long enough for all instances to complete, or do a manual migration). Having helped a lot of companies think about moving their content and process management systems to another platform, I know that this is one of those things that sounds mundane but is actually difficult to do well.
  • Vega Unity, billed as a digital transformation platform; we spent most of our time talking about Unity 7, their latest release, which I’ll cover in more detail below.
  • Vertical solutions for insurance (underwriting, claims, financial operations), government (case management, compliance) and banking (onboarding, loan origination and servicing, wealth management, card dispute resolution).

Unity 7 is an integration and application development tool that links third-party content and process systems, adding a consistent user experience layer and consolidated analytics. Vega doesn’t provide any of the back-end systems, although they partner with a couple of the vendors, but provides tools to take that heterogeneous desktop environment and turn it into a single user interface. This has significant value in simplifying the user environment, since users only need to learn one system and some of the inter-system integration is automated behind the scenes, but it’s also of benefit when replacing one or more of the underlying technologies for legacy modernization, or consolidating technology after a corporate acquisition. This is what systems integrators have been doing for a long time, but Unity makes it into a product that also leverages the deep system knowledge gained from their Interchange product. Vega can add Unity to simplify an existing environment, or come in on a net-new ECM/BPM implementation that uses one of their partner technologies plus their application development/integration layer. The primary use cases are federated enterprise content search (where content is indexed in the Unity Intelligence engine, including semantic searches), case management applications, and legacy modernization, where a new front end on legacy systems allows those systems to be swapped out without changing the user environment.

Unity is all about rapid development of applications that combine case management, content management, data and analytics. As we walked through the product and sample applications, there was definitely a strong whiff of FileNet P8 in here (a system that I used to be very familiar with), since the sample was built with IBM Case Manager under the covers, but with some nice additions in terms of unified interface and analytics.

Their claim is that the Unity Case Manager would look the same regardless of the underlying technology, which would definitely make it easier to swap out or federate content, case and process management systems behind the scenes. In the sample shown, since IBM Case Manager was primary, the case view was derived directly from IBM CM case data with the main document list from IBM FileNet P8, while the “Other Documents” tab showed related documents from Alfresco. Dynamic foldering can combine content from different systems into common folders to reduce this visual dichotomy. There are role-based views based on the user profile that provide access to data from multiple systems – including CRM and others in addition to ECM and BPM – and federate it into business objects that can include records, virtual folder structures and related objects such as people or claims. Individual user credentials can be passed to the underlying systems, or shared credentials can be used in connectors for retrieving unrestricted information. Search templates, system connectors and a variety of properties are set in a configuration console, making it straightforward to set up and modify standard operations; since this is an XML-based declarative environment, these configuration changes deploy immediately. The ability to make different types of configuration changes is role-based, meaning that some business users can be permitted to make changes to the shared user interface if desired.

Unity Intelligence adds a layer of visual analytics that aggregates data from the underlying systems and other sources; however, this isn’t just visualization, but can be used to filter work and take action on cases directly via action popup menus, or to open cases directly from the analytics interface. They’re using open source tools such as Solr (search), Lucene (information retrieval) and D3 (visualization) to good effect: I saw a demo of a Sankey diagram representing the workflow through cases based on real-time data that provided a sort of process mining view of work in progress, and allowed selecting dates for past views of work, including completed cases. For case management, in which processes are semi-structured (at best), this won’t necessarily show process anomalies, but can show service interruptions and opportunities for process improvement and standardization.
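To give a sense of what that Sankey view encodes, here's a toy sketch of case volumes flowing between workflow steps. Vega builds theirs with D3 on live data; Python and plotly serve as a stand-in here, with invented transition counts.

```python
# Toy Sankey diagram of cases flowing between workflow steps;
# the steps and counts are invented for illustration.
import plotly.graph_objects as go

steps = ["Intake", "Review", "Approve", "Reject", "Complete"]
# (source step index, target step index, number of cases)
transitions = [(0, 1, 120), (1, 2, 80), (1, 3, 40), (2, 4, 75)]

fig = go.Figure(go.Sankey(
    node=dict(label=steps),
    link=dict(
        source=[s for s, _, _ in transitions],
        target=[t for _, t, _ in transitions],
        value=[v for _, _, v in transitions],
    ),
))
fig.show()
```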

They’ve published a video showing more about Unity 7 Intelligence, as well as one showing Unity Semantics for creating pivot tables for faceted search on content repositories.

Column 2 wrapup for 2017

As the year draws to an end, I’m taking a look at what I wrote here this year, and what you were reading.

I had fewer posts this year since I curtailed a lot of my conference travel, but still managed to publish 40 posts. I covered a few conferences – Big Data Toronto, OpenText Enterprise World, ABBYY Technology Summit, TIBCO NOW (as an uninvited gate-crasher) and some local AIIM seminars – and a variety of technology topics including BPM (or DPA/digital business as the terminology changes), low code, RPA, case management, decision management and capture.

Inexplicably, the two most read posts this year were one from 2007 on policies, procedures, processes and rules, and one from 2011 on BPMS pricing transparency. The most popular posts that were written this year were from OpenText Enterprise World, plus the page that I published listing the books and journals to which I’ve contributed.

Although US-based readers are the largest group by far, there was also a lot of traffic from India, Canada, Germany, UK and Australia, with many other countries contributing smaller amounts of traffic.

I also made some technical improvements: the site is now served over HTTPS, and uses Cloudflare to enforce security and fend off some of the spam bots that were killing performance; this has resulted in the use of CAPTCHAs for some IP ranges and countries.

Thanks to all of you for reading and commenting this year, and I look forward to engaging in 2018.

Happy New Year!

A Perfect Combination: Low Code and Case Management

The paper that I wrote on low code and case management has just been published – consider it a Christmas gift! It’s sponsored by TIBCO, and you can find it here (registration required).

This is an accompaniment to the webinar that I did recently with Roger King and Nicolas Marzin, which is available for replay on demand.

Fun times with low code and case management

I recently held a webinar on low code and case management, along with Roger King and Nicolas Marzin of TIBCO (TIBCO sponsored the webinar). We tossed aside the usual webinar presentation style and had a free-ranging conversation over 45 minutes, with Nicolas giving a quick demo of TIBCO’s Live Apps in the middle.

Although preparing for a webinar like this takes just as long as a standard presentation, it’s a lot more fun to participate. I also think it’s more engaging for the audience, even though there’s not as much visual material; I created some slides with a few points on the topics that we planned to cover, including some fun graphics. I couldn’t resist including a visual pun about long tail applications.

You can find the playback here if you missed it, or want to watch again. If you watched it live, there was a problem with the audio for the first couple of slides; since it was mostly me giving some introductory remarks and a quick overview of case management and low code, we just re-recorded those few minutes and fixed the on-demand version.

I’m finishing up a white paper for TIBCO on case management and low code, stressing that not only is low code the way to go for building case management applications, but that a case management paradigm is the best fit for low code applications. We should have that in publication shortly, so stay tuned. If you attended the webinar, you should receive a link to the paper when it’s published.

Machine learning in ABBYY FlexiCapture

Chip VonBurg, senior solutions architect at ABBYY, gave us a look at machine learning in FlexiCapture 12. This is my last session for ABBYY Technology Summit 2017; there’s a roadmap session after this to close the conference, but I have to catch a plane.

He started with a basic definition of machine learning: a method of data analysis that automates analytical model building, allowing computers to find insights in data and execute logic without being explicitly programmed for where to look or what to do. It’s based on pattern recognition and computational statistics, and it’s popping up in areas such as biology, search and recommendations (e.g., Netflix), and spam detection. Machine learning is an iterative process that uses sample data and one or more machine learning algorithms: the training data set is used by the algorithm to build an analytical model, which is then applied to attempt to analyze or classify new data. Feedback on the correctness of the model for the new data is fed back to refine the learning and therefore the model. In many cases, users don’t even know that they’re providing feedback to train machine learning: every time you click “Spam” on a message in Gmail (or “Not Spam” for something that was improperly classified), or thumbs up/down for a movie in Netflix, you’re providing feedback to their machine learning models.
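To make that loop concrete, here's a minimal sketch of iterative training with user feedback, using scikit-learn as a stand-in; this is purely illustrative, and not ABBYY's implementation. An initial model is trained on labeled samples, applied to new data, and incrementally retrained whenever a user corrects a prediction.

```python
# Minimal sketch of the train/apply/feedback loop; illustrative only.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

vectorizer = HashingVectorizer(n_features=2**16)
model = SGDClassifier()  # a linear model that supports incremental training

# Initial training on a labeled sample set (e.g., spam vs. not spam).
train_texts = ["win a free prize now", "meeting agenda attached"]
train_labels = ["spam", "ham"]
model.partial_fit(vectorizer.transform(train_texts), train_labels,
                  classes=["spam", "ham"])

# Apply the model to new data, then fold the user's correction back in:
# the equivalent of clicking "Spam" / "Not Spam" on a message.
new_text = "free prize waiting for you"
predicted = model.predict(vectorizer.transform([new_text]))[0]
user_label = "spam"  # feedback from the user on the true class
if predicted != user_label:
    model.partial_fit(vectorizer.transform([new_text]), [user_label])
```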

He walked us through several different algorithms and their specific applicability: Naive Bayes, Support Vector Machine (SVM), and deep learning; then a bit about machine learning scenarios including supervised, unsupervised and reinforcement learning. In FlexiCapture, machine learning can be used to sort documents into categories (classification), and for training on field-level recognition. The reason that this is important for ABBYY customers (partners and end customers) is that it radically compresses the time to develop the rules required for any capture project, which typically consumes most of the development time. For example, instead of training a capture application only for the most common documents because that’s all you have time for, it can be trained for all document types, and the model will then continue to self-improve as verification users correct errors made by the algorithm.
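As a toy example of the classification use case, here's how one of the algorithms he mentioned (Naive Bayes) can sort recognized document text into categories; scikit-learn stands in for FlexiCapture's engine, and the documents and labels are invented.

```python
# Sketch of sorting captured documents into categories with Naive Bayes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Recognized text from previously captured documents, with known types.
documents = [
    "invoice number 1234 total amount due",
    "claim form policy holder accident date",
    "invoice payment terms net 30 days",
    "claim adjuster assessment of damages",
]
labels = ["invoice", "claim", "invoice", "claim"]

classifier = make_pipeline(TfidfVectorizer(), MultinomialNB())
classifier.fit(documents, labels)

# Route a newly captured document to the right extraction rules.
print(classifier.predict(["total amount due on invoice 5678"]))  # ['invoice']
```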

Although VonBurg was unsure if the machine learning capabilities are available yet in the SDK — he works in the FlexiCapture application team, which is based on the same technology stack but runs independently — the session on robotic information capture yesterday seems to indicate that it is in the SDK, or will be very soon.

Capture microservices for BPO with iCapt and ABBYY

Claudio Chaves Jr. of iCapt presented a session at ABBYY Technology Summit on how business process outsourcing (BPO) operations are improving efficiencies through service reusability. iCapt is a solutions provider for a group of Brazilian companies, including three BPOs in specific verticals, a physical document storage company, and a scanner distributor. He walked through a typical BPO capture flow — scan, recognize, classify, extract, validate, export — and how each stage can be implemented using standalone scan products, OCR SDKs, custom UIs and ECM platforms. Even though this capture process only outputs data to the customer’s business systems at the end, such a solution needs to interact with those systems throughout for data validation; in fact, the existing business systems may provide some overlapping capabilities with the capture process. iCapt decided to turn this traditional capture process around by decoupling each stage into independent, reusable microservices that can be invoked from the business systems or some other workflow capability, so that the business system is the driver for the end-to-end capture flow. The microservices can be invoked in any order, and only the ones that are required are invoked. As independent services, each of them can be scaled up and distributed independently without having to scale the entire capture process.

The recognize, classify and extract steps are typically unattended, and became immediate candidates to be implemented as microservices. This allows them to be reused across processes, scaled independently, and deployed on premise or in the cloud. For example, a capture process that handles a single type of document doesn’t require the classification service, but only uses the recognize and extract services; another process that uses all three may reuse the same recognize and extract services when it encounters the same document type as the first process, but also uses the classify service to determine the document type in heterogeneous batches of documents. iCapt is using ABBYY FineReader as a core component in their iCaptServices Cloud offering, embedded within their own web APIs that offer higher-level services on top of the FRE core functions; the entire package can be deployed as a container or serverless function to be called from other applications. They also offer mobile client development services so that these business applications can capture content on mobile devices.
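As a rough sketch of what that business-system-driven flow could look like, here are the two scenarios just described, written against hypothetical REST endpoints; the URLs and payloads are my own invention for illustration, not iCapt's actual API.

```python
# Sketch of a business system invoking decoupled capture microservices;
# endpoint names and payloads are hypothetical.
import requests

BASE = "https://capture.example.com"  # hypothetical iCaptServices endpoint

def capture_known_type(image_bytes: bytes) -> dict:
    """Single-document-type flow: recognize and extract only."""
    text = requests.post(f"{BASE}/recognize", data=image_bytes).json()["text"]
    return requests.post(f"{BASE}/extract",
                         json={"text": text, "type": "invoice"}).json()

def capture_mixed_batch_item(image_bytes: bytes) -> dict:
    """Heterogeneous batch: classify first, then reuse the same
    recognize and extract services for the detected type."""
    text = requests.post(f"{BASE}/recognize", data=image_bytes).json()["text"]
    doc_type = requests.post(f"{BASE}/classify",
                             json={"text": text}).json()["type"]
    return requests.post(f"{BASE}/extract",
                         json={"text": text, "type": doc_type}).json()
```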

He gave an example of a project that they did for recovering old accounting records by scanning and recognizing paper books; this was a one-time conversion project, not an ongoing BPO operation, making it crucial that they be able to build the data capture application quickly without developing an excessive amount of custom code that would have been discarded after the 10-week project duration. They’re currently using the Windows version of ABBYY which increases their container/cloud costs somewhat, and are interested in trying out the Linux version that we heard about yesterday.

Pairing @UiPath and ABBYY for image capture within RPA

Andrew Rayner of UiPath presented at the ABBYY Technology Summit on robotic process automation powered by ABBYY’s FineReader Engine (FRE). He started with a basic definition of RPA — emulating human execution of repetitive processes with existing applications — and the expected benefits in high scalability and reduction in errors, costs and cycle time. RPA products work really well with text on the screen, copying and pasting data between applications, and many are using machine learning to train and improve their automated actions so that it’s more than the simpler old-school “screen scraping” that was dependent purely on field locations on the screen.

What RPA doesn’t do, however, is work with images; that’s where ABBYY FRE comes in. UiPath gives developers using UiPath Studio the ability to OCR images as part of the RPA flow: an image is passed to FineReader for recognition, then an XML file of the recognized data is returned in order to complete the next robotic steps. Note that “images” may be scanned documents, but can also be virtualized screens that don’t transfer data fields directly, just display the screen as an image, such as you might have with an application running in Citrix — this is a pretty important capability that still eludes standard RPA.

Rayner walked through an example of invoice processing (definitely the most common example used in all presentations here, in part because of ABBYY’s capabilities in invoice recognition): UiPath grabs the scanned documents and drops them in a folder for ABBYY; FRE does the recognition pass and creates the output XML files as well as managing the human verification step, including applying machine learning on the human interaction to continuously improve the recognition as we heard about yesterday; then finally, UiPath pushes the results into SAP for completing the payment process.
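To make the handoff concrete, here's a rough sketch of that folder-based integration pattern; the folder names and XML layout are my own assumptions, not the actual UiPath or FineReader file formats.

```python
# Sketch of the RPA-to-OCR handoff: the robot drops scans in a folder,
# the OCR pass writes XML results, and the robot collects the fields.
import time
import xml.etree.ElementTree as ET
from pathlib import Path

OUTBOX = Path("ocr_out")  # where recognized XML results appear

def poll_for_results(timeout_s: int = 60) -> list[dict]:
    """Collect recognized invoice fields from the OCR output folder."""
    deadline = time.time() + timeout_s
    results = []
    while time.time() < deadline:
        for xml_file in OUTBOX.glob("*.xml"):
            root = ET.parse(xml_file).getroot()
            # Assume one <field name="..."> element per extracted value.
            fields = {f.get("name"): f.text for f in root.iter("field")}
            results.append(fields)
            xml_file.unlink()  # consumed; hand off to the next robot step
        if results:
            return results
        time.sleep(2)  # wait for the OCR pass to finish
    return results
```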

For solution developers working with RPA and needing to integrate data captured from images or virtualized screens, this is a pretty compelling advantage for UiPath.

ABBYY mobile real-time recognition

Dimitry Chubanov and Derek Gerber presented at the ABBYY Technology Summit on ABBYY’s mobile real-time recognition (RTR), which allows for recognition directly on a mobile device, rather than just capturing content to pass on to a back-end recognition server. Mobile data capture comes in two basic flavors: first, the mobile user is just entering data, such as an account number or password; and second, the mobile user is entering both data and image, such as personal data and a copy of their ID.

ABBYY RTR isn’t based on taking a photo and then running recognition on that image; instead, it takes several frames from the camera preview stream and runs recognition algorithms on the stream without having to capture an image. This provides a better user experience, since the recognition results are immediate and the user doesn’t have to type the data manually, and better privacy, since no image is captured to the phone or passed to any other device or server. They demonstrated this using a sample app on an iPhone; it’s interesting to see the results changing slightly as the phone moves around, since the recognition is happening using the previous several frames of video data, and it gradually gains recognition confidence after a few seconds of video. We saw recognition of unstructured paragraphs of text, drivers licenses, passports and bank cards. The SDK ships with a lot of predefined document types, or you can create your own by training for specific fields using location and regular expressions. They are also offering the ability to capture meter data, such as electricity meter readings, although some of this requirement is being obviated by smart meters and other IoT advances.
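Conceptually, the approach looks something like the sketch below: recognition runs on each preview frame, and results are accumulated until they stabilize. The recognize_frame function is a hypothetical stand-in for the RTR SDK call, which I haven't worked with directly.

```python
# Sketch of recognition over a camera preview stream rather than a
# captured photo: per-frame results are accumulated until confidence
# stabilizes; no image is ever written to storage.
from collections import Counter

CONFIDENCE_THRESHOLD = 0.95
MIN_FRAMES = 5  # require agreement across several frames

def recognize_stream(frames, recognize_frame):
    """frames: iterable of preview frames;
    recognize_frame: hypothetical SDK call, frame -> (text, confidence)."""
    votes = Counter()
    for i, frame in enumerate(frames, start=1):
        text, confidence = recognize_frame(frame)
        votes[text] += confidence
        best_text, best_score = votes.most_common(1)[0]
        # Average confidence of the leading candidate across frames seen.
        if i >= MIN_FRAMES and best_score / i >= CONFIDENCE_THRESHOLD:
            return best_text
    return None  # never reached stable confidence
```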

They also have a mobile imaging SDK that can capture an image when it’s needed — for proof of ID, for example — with scene stabilization, document edge detection, deskewing and various types of image enhancement to capture the optimal photo for downstream storage and processing.

I can imagine, for example, a mobile airline app that needs to capture your passport information using mobile RTR to grab the data directly rather than having you type it in. I’ve also seen something very similar used to capture the unique number from an iTunes gift card directly into the App Store on an iPhone. Just like QR code reading is now built right into the search bar on the mobile versions of Google Chrome, and Google Translate on mobile allows real-time capture of text using the same camera preview mode (plus simultaneous translation), being able to capture text from a printed source instead of requiring a mobile user to type it in is likely to become ubiquitous in mobile apps.

ABBYY Robotic Information Capture applies machine learning to capture

Back in the SDK track at ABBYY Technology Summit, I attended a session on “robotic information capture” with FlexiCapture Engine 12, with lead product manager Andrew Zyuzin and director of product marketing Semyon Sergunin showing some of the automated classification and data extraction capabilities powered by machine learning. Traditional enterprise capture uses manually-created rules for classification and data extraction to set up for automated capture: a time-consuming training process up front in order to maximize recognition rates. At the other end of the spectrum, robotic process automation uses machine learning to analyze user actions, and creates classification and extraction algorithms that can be run by robots to replace human operators. In the Goldilocks middle, they position robotic information capture as a blending of these two ideas: the system is pre-trained and processes standard documents out of the box, then uses machine learning to enhance the recognition for non-standard documents by analyzing how human operators handle the exceptions. Although I’m not completely aligned with their use of the term robotic process automation, since RPA is not completely synonymous with machine learning and also isn’t limited to capture applications, I understand why they’re positioning ML-assisted capture as a middle ground between traditional capture and ML-assisted RPA.

We saw a demo of this with invoice capture: a PDF invoice was processed through their standard invoice recognition, detecting vendor name and invoice number, but the wrong number was picked up for the total amount due to the location of the field. This was corrected by a user in the verification client, and the information of where to find the total was analyzed for retraining and fed back to the recognition model. The user doesn’t know that they’re actually training the system — there’s no explicit training mode — but it just happens automatically in the background for continuous improvement of the recognition rates, gradually reducing the amount of manual verification. After the training was fed back, we saw another invoice from the same vendor processed, with the invoice total field properly detected.
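The mechanics of that invisible retraining might look something like this simplification: remember the field region that a verification user corrected, keyed by vendor, and prefer it over the default the next time a document from that vendor arrives. This is my own sketch of the idea, not FlexiCapture's actual model.

```python
# Sketch of learning field locations from verification corrections.
field_regions: dict[tuple[str, str], tuple[int, int, int, int]] = {}

def record_correction(vendor: str, field: str,
                      region: tuple[int, int, int, int]) -> None:
    """region = (x, y, width, height) chosen by the verification user."""
    field_regions[(vendor, field)] = region

def extract_field(vendor: str, field: str,
                  default_region: tuple[int, int, int, int],
                  read_region) -> str:
    """Prefer a region learned from past corrections over the default."""
    region = field_regions.get((vendor, field), default_region)
    return read_region(region)  # OCR the chosen area of the page
```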

Although I think that most technology is pretty interesting, this is the first thing I’ve seen today that made me say “cool!”

Zyuzin also walked us through their advanced classification, which can classify documents without any development, based on large data sets of typical document types such as invoices, cheques, and drivers licences; automatic classification is important as the front end to recognition so that the correct recognition techniques and templates can be applied. Their advanced classification uses both image and content classification, that is, it determines what type of document it is based on how it looks as well as on the available text content. He showed us a demo of processing a package of mortgage documents, where there is a large number of possible documents that can be submitted by a consumer as supporting documentation; most of the documents were properly classified, but a few were unrecognized and required a quick setup of a new document type to train the classifier. This was more of a manual training process, but once the new document class was created, it could be applied to other unrecognized documents in the package.
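Combining the two signals can be as simple as averaging per-class probabilities from an image model and a text model, as in this sketch; both classifiers are hypothetical stand-ins with scikit-learn-style predict_proba interfaces, since ABBYY didn't describe how they blend the two.

```python
# Sketch of combined image + content classification: each model votes
# with class probabilities, and the average decides the document type.
import numpy as np

def classify_document(image_features, text_features,
                      image_clf, text_clf, classes):
    """image_clf / text_clf: hypothetical models with predict_proba."""
    p_image = image_clf.predict_proba([image_features])[0]
    p_text = text_clf.predict_proba([text_features])[0]
    combined = (np.asarray(p_image) + np.asarray(p_text)) / 2
    return classes[int(np.argmax(combined))]
```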

ABBYY Recognition Server 5.0 update

I’ve switched over to the FlexiCapture technical track at the ABBYY Technology Summit for a preview of the new version of Recognition Server to be released in the first half of 2018. Paula Saunders, director of sales engineering, walked us through a presentation of the features and a demo.

New features include:

  • Smart PDF quality detection and processing, including detecting if there is already an OCR layer on the document and using that instead of re-recognizing
  • Support of PDF/E standard for engineering drawings
  • Import of email messages in MSG format, including both the message text and attachments
  • Advanced document editing at indexing and verification stations, such as rotation and redaction
  • Support of user languages and pattern-matching to fine-tune non-standard text
  • Extracting index fields by using a template of fixed regions
  • Native 64-bit support

Recognition Server is aimed at production capture work, where you set up a capture workflow and configure the input, process, document separation, quality control, indexing and output stages of that flow. She walked us through the screens for creating a new workflow and setting the properties at each stage, then showed us what it would look like at an indexing station if you wanted to edit the original image: deskewing, despeckling, cropping and more. The indexing station module also allows you to create field templates for fine-tuning the recognition and mapping form areas to index fields directly on live document data. The verification station module can be used for additional training using pattern matching, such as recognizing unusual fonts.