Category Archives: ECM

enterprise content management

Cloud ECM with @l_elwood @OpenText at AIIM Toronto Chapter

Lynn Elwood, VP of Cloud and Services Solutions at OpenText, presented on managing information in a cloud world at today’s AIIM chapter meeting in Toronto. This is of particular interest to Canadians, since most of the cloud service offerings that we see are in the US, and many companies are not comfortable with keeping their private data in a jurisdiction where it can be somewhat easily exposed to foreign government and intelligence agencies.

She used a building analogy to talk about cloud services:

  • Infrastructure as a service (IaaS) is like a piece of serviced land on which you need to build your own building and worry about your connections to services. If your water or electricity is off, you likely need to debug the problem yourself although if you find that the problem is with the underlying services, you can go back to the service provider.
  • Platform as a service (PaaS) is like a mobile home park, where you are responsible for your own dwelling but not for the services, and there are shared services used by all residents.
  • Software as a service (SaaS) is like a condo building, where you own your own part of it, but it’s within a shared environment. SaaS by Gartner’s definition is multi-tenant, and that’s the analogy: you are at the whim, to a certain extent, of the building management in terms of service availability, but at a greatly reduced cost.
  • Dedicated, hosted or managed is like a private house on serviced land, where everything in the house is up to you to maintain. In this set of analogies, not sure that there is a lot of distinction between this and IaaS.
  • On-premises is like a cottage, where you probably need to deal with a lot of the services yourself, such as water and septic systems. You can bring in someone to help, but it’s ultimately all your responsibility.
  • Hybrid is a combo of things — cloud to cloud, cloud to on-premise — such as owning a condo and driving to a cottage, where you have different levels of service at each location but they share information.
  • Managed services is like having a property manager, although it can be cloud or on-premise, to augment your own efforts (or that of your staff).

Regardless of the platform, anything that touches the cloud is going to have a security consideration as well as performance/up-time SLAs if you want to consider it as part of your core business. From my experience, on-premise solutions can be just as insecure and unstable as any cloud offering, so good to know what you’re comparing with when you are looking at cloud versus on-premise.

Most organizations require that their cloud provider have some level of certification: of the facility (data centre), platform (infrastructure) and service (application). Elwood talked about the cloud standards that impact these, including ISO 27001, and SOC 1, 2 and 3.

A big concern is around applications in the cloud, namely SaaS such as Box or Salesforce. Although IT will be focused on whether the security of that application can be breached, business and information managers need to be concerned about what type of data is being stored in those applications and whether it potentially violates any privacy regulations. Take a good look at those SaaS EULAs (Elwood took us through some Apple and Google examples) and have your lawyers look at them as well if you’re deploying these solutions within the enterprise. You also need to look at data residency requirements (as I mentioned at the start): where the data resides, the sovereignty of the hosting company, the routing between you and the data even if the data resides in your own country, and the backup policies of the hosting company. The US Patriot Act allows the US government to access any data that passes through, is stored in, or is hosted by a company that is domiciled in the US; other countries are also adding similar laws. Although a company may have a data centre in your country, if they’re a US company, they probably have a default to store/process/backup in the US: check out the Microsoft hosting and data processing agreement, for example, which specifies that your data will be hosted and/or processed in the US unless you explicitly request otherwise. There’s an additional issue that even if your data has the appropriate residency, if an employee is travelling to a restricted country and accesses the data remotely, you may be violating privacy regulations; not all applications have the ability to filter otherwise authenticated access based on IP address. If you add this to the ability of foreign governments to demand device passwords in order to enter a country, the information accessible via an employee’s computer (not just the information stored on it) is at risk for exposure.
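
To make the IP-filtering point concrete, here is a minimal sketch of what filtering otherwise authenticated access by source address could look like. It is not taken from any product discussed at the meeting: the allow-list, the function name and the addresses (reserved documentation ranges) are all hypothetical, and it assumes the application can see the client's real IP address.

```python
import ipaddress

# Hypothetical allow-list of networks from which access is permitted.
# These are reserved documentation ranges standing in for real office networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_access_allowed(authenticated: bool, client_ip: str) -> bool:
    """Allow access only if the user is authenticated AND the request
    originates from one of the allowed networks."""
    if not authenticated:
        return False
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


# A valid login from an allowed network passes; the same login from
# any other address is refused.
print(is_access_allowed(True, "203.0.113.10"))
print(is_access_allowed(True, "192.0.2.5"))
```

A source address is only a rough proxy for location (VPNs and proxies can hide it), so a check like this would likely be one control among several rather than a complete answer to the residency problem.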

Elwood showed a map of the information governance laws and regulations around the world, and it’s a horrifying mix of acronyms for data protection and privacy rules, regulated records retention, eDiscovery requirements, information integrity and authenticity, and reporting obligations. There’s a new EU regulation — the General Data Protection Regulation (GDPR) — that is going to be a game-changer, harmonizing laws across all 28 member nations and applying to any data collected about an EU citizen even outside the EU. The GDPR includes increased consent standards, stronger individual data rights, stronger breach notification, increased governance obligation, stronger recordkeeping requirements, and data transfer constraints. Interestingly, Canada is recognized as one of the countries that is deemed to have “adequate protection” for data transfer, along with Andorra, Argentina, the Faroe Islands, the Channel Islands (Guernsey and Jersey), Isle of Man, Israel, New Zealand, Switzerland and Uruguay. In my opinion, many companies aren’t even aware of the GDPR, much less complying with it, and this is going to be a big wake-up call. Your compliance teams need to be aware of the global landscape as it impacts your data usage and applications, whether in the cloud or on premise; companies can receive huge fines (up to 4% of annual revenue) for violating GDPR whether they are the “owner” of the data or just a data processor/host.

OpenText has a lot of GDPR information on their website that is not specific to their products if you want to read more. 

There are a lot of benefits to cloud when it comes to information management, and a lot of things to consider: agility to grow and change quickly; a services approach that requires partnering with the service provider; mobility capabilities offered by cloud platforms that may not be available for on premise; and analytics offered by cloud vendors within and across applications.

She finished up with a discussion on the top areas of concerns for the attendees: security, regulations, GDPR, data sovereignty, consumer applications, and others. Great discussion amongst the attendees, many of whom work in the Canadian financial services industry: as expected, the biggest concerns are about data residency and sovereignty. GDPR is seen as having the potential to level the regulatory playing field by making everyone comply; once the data centres and service providers start to comply, it will be much easier for most organizations to outsource that piece of their compliance by moving to cloud services. I think that cloud service providers are already doing a better job at security and availability than most on-premise systems, so once they crack the data residency and sovereignty problem there is little reason to have a private data centre. IT’s concern has mostly been around security and availability, but now is the time for information and compliance managers to get involved to ensure that privacy regulations are supported by these platforms.

There are Canadian companies using cloud services, even the big banks and government, although I am guessing that it’s for peripheral rather than core services. Although some are doing this “accidentally” as the only way to share information with external participants, it’s likely time for many companies to revisit their information management strategies to see if they can be more inclusive of properly vetted cloud solutions.

We did get a very brief review of OpenText and their offerings at the end, including their software solutions and their EIM cloud offerings under the OpenText Cloud banner. They are holding their Enterprise World user conference in Toronto this July, making them the first (but likely not the last) big software company to see the benefits of a non-US North American conference location.

AIIM Toronto seminar: @jasonbero on Microsoft’s ECM

I’ve recently rejoined AIIM — I was a member years ago when I did a lot of document capture and workflow implementation projects, but drifted away as I became more focused on process — and decided to check out this month’s breakfast seminar hosted by the AIIM Toronto chapter. Today’s presenter was Jason Bero from Microsoft Canada, who is a certified records manager and information governance specialist, talking about SharePoint and related Microsoft technologies that are used for classification, preservation, protection and disposal of information assets.

He started out with AIIM’s view of the stages of information management as a framework for describing where SharePoint fits in and its new functionality. (The diagram he used, which I found online but is almost certainly copyright AIIM, is not reproduced here.)

There’s a shift happening in information management, since a lot of information is now created natively in electronic form, may be generated by customers and employees using mobile apps, and even stored outside the corporate firewall on cloud ECM platforms. This creates challenges in authentication and security, content privacy protection, automatic content classification, and content federation across platforms. Microsoft is adding data loss prevention (DLP) and records management capabilities to SharePoint to meet some of these challenges, including:

  • Compliance Policy Center
  • DLP policies and management
  • Policy notification messages
  • Document deletion policies
  • Enhanced retention and disposition policies for working documents
  • Document- and records-centric workflow with a web-based workflow design tool
  • Advanced e-discovery for unstructured documents, including identifying relevant relationships between documents
  • Advanced auditing, including SharePoint Online and OneDrive for Business as well as on-premise repositories
  • Data governance: somewhat controversially (at my table of breakfast colleagues, anyway), this replaces the use of metadata fields with a new “tags” concept
  • Rights management on documents that can be applied to both internal and external viewers of a document

AIIM describes an information classification and protection cycle: classification, labeling, encryption, access control, policy enforcement, document tracking, and document revocation; Bero described how SharePoint addresses these requirements, with particular attention paid to Canadian concerns for the local audience, such as encryption keys. I haven’t looked at SharePoint in quite a while (and I’m not really much of an ECM expert any more), but it looks like lots of functionality that boosts SharePoint into a more complete ECM and RM solution. This muscles in on some of the territory of their ISV partners who have provided these capabilities as SharePoint add-ons, although I imagine that a lot of Microsoft customers are lingering on ancient versions of SharePoint and will still be using those third-party add-ons for a while. In organizations that are pure Microsoft, however, the ways that they can integrate their ECM/RM capabilities across all of their information creation, management and collaboration tools, from Office 365 to Skype for Business, provide a seamless environment for protecting and managing information.

He gave a live demo of some of these capabilities at work, showing how the PowerPoint presentation that he used would be automatically classified, shared, protected and managed based on its content and metadata, and the additional manual overrides that can be applied such as emailing him when an internal or external collaborator opens the document. Documents sent to external participants are accompanied by Microsoft Rights Management, providing the ability to see when and where people open the document, limiting or allowing downloads and printing, and allowing the originator to revoke access to the document. [Apparently, it’s now highly discouraged to send emails with attachments within Microsoft, which is a bit ironic considering that bloated Outlook pst files due to email attachments are the scourge of many IT departments.] Some of their rights management can be applied to non-Microsoft repositories such as Box, although this requires a third-party add-on.

There was a question about synchronous collaborative editing of documents: you can now do this with shared Office documents using a combination of the desktop applications and browser apps, such that you see other people’s changes in the document in real time while you’re editing it (like Google Docs), without requiring explicit check-out/check-in. I assume that this requires that the document is stored in a Microsoft repository, either on-premise or cloud, but that’s still an impressive upgrade.

One of the goals in this foray by Microsoft into more advanced ECM is to provide capabilities that are automated as much as possible, and generally easy-to-use for anything requiring manual input. This allows records management to happen on the fly by everyday users, rather than requiring a lot of intervention by trained records management people or complex custom workflows, and to have DLP policies applied directly within the tools that people are already using for creating, managing and sharing documents. Given the dominance of Microsoft on the desktop of today’s businesses, and the proliferation of SharePoint, a good way to improve compliance with better control over information assets.

AIIM Toronto seminar: FNF Canada’s data capture success

Following John Mancini’s keynote, we heard from two of the sponsors, SourceHOV and ABBYY. Pam Davis of SourceHOV spoke about EIM/ECM market trends, based primarily on analyst reports and surveys, before giving an overview of their BoxOffice product.

ABBYY chose to give their speaking slot to a customer, Anjum Iqbal of FNF Canada, who spoke about their capture and ECM projects. FNF provides services to financial institutions in a variety of lending areas, and deals with a lot of faxed documents. A new business line would have their volume move to 4,500 inbound faxes daily, mostly time-sensitive documents, such as mortgage or loan closing, that need to be processed within an hour of receipt. To do this manually, they would have needed to increase their 4 full-time staff to 10 people to handle the inbound workflow even at a rate of 1 document/minute; instead, they used ABBYY FlexiCapture to build a capture solution for the faxes that would extract the data using OCR, and interface with their downstream content and workflow systems without human intervention. The presentation went by pretty quickly, but we learned that they had a 3-month implementation time.
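
The staffing figure is easy to sanity-check. The sketch below reproduces the arithmetic from the volume and rate quoted in the talk; the 7.5-hour working day is my own assumption (the presentation did not state shift length), and the function name is purely illustrative.

```python
import math


def staff_needed(docs_per_day: int,
                 docs_per_minute_per_person: float,
                 hours_per_person_per_day: float) -> int:
    """Number of full-time people needed to key a daily document volume."""
    minutes_of_work = docs_per_day / docs_per_minute_per_person
    person_hours = minutes_of_work / 60
    return math.ceil(person_hours / hours_per_person_per_day)


# 4,500 faxes at 1 document/minute is 75 person-hours of work per day.
# ASSUMPTION: a 7.5-hour working day per person (not stated in the talk).
print(staff_needed(4500, 1.0, 7.5))  # prints 10
```

This is only an average-throughput estimate: it ignores the one-hour turnaround requirement, which would push the manual headcount higher if the faxes arrive in bursts.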

I stayed on for the roundtable that ABBYY hosted, with Iqbal giving more details on their implementation. They reached a tipping point when the volume of inbound printed faxes just couldn’t be handled manually, particularly when they added some new business lines that would increase their volume significantly. Unfortunately, the processes involving the banks were stuck on fax technology (that is, the banks refused to move to electronic transfer rather than faxes) so they needed to work with that fixed constraint. They needed quality data with near-zero error rates extracted from the faxes, and selected ABBYY and one of their partners to help build a solution that took advantage of standard form formats and 100% machine printing on the forms (rather than handwriting). The forms weren’t strictly fixed format, in that some critical information such as mortgage rates may be in different places on the document depending on the other content of the form; this requires a more intelligent document classification as well as content analytics to extract the information. They have more than 40 templates that cover all of their use cases, although they still need to have one person in the process to manage any exceptions where the recognition certainty was below a certain percentage. Given the generally poor quality of faxed documents, undoubtedly this capture process could also handle documents scanned on a standard business scanner or even a mobile device in addition to their current RightFax server. Once the data is captured, it’s formatted as XML, which their internal development team then used to integrate with the downstream processes, while the original faxes are stored in a content management system.
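
As a rough illustration of the exception handling and XML hand-off described above, here is a minimal sketch. It does not use the ABBYY FlexiCapture API and is not FNF Canada’s implementation: the threshold value, field names, template name and XML layout are all hypothetical.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: the actual certainty cut-off was not given in the roundtable.
CONFIDENCE_THRESHOLD = 0.95


def route_document(fields: dict) -> str:
    """fields maps a field name to a (value, confidence) pair.
    Send the document to a person if any field falls below the threshold."""
    lowest = min(confidence for _, confidence in fields.values())
    return "auto" if lowest >= CONFIDENCE_THRESHOLD else "manual_review"


def to_xml(doc_id: str, template: str, fields: dict) -> str:
    """Serialize the extracted fields as XML for downstream systems."""
    root = ET.Element("document", id=doc_id, template=template)
    for name, (value, confidence) in fields.items():
        field = ET.SubElement(root, "field", name=name,
                              confidence=f"{confidence:.2f}")
        field.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical extraction result for one inbound fax.
extracted = {"mortgage_rate": ("3.25", 0.99), "principal": ("250000", 0.80)}
print(route_document(extracted))  # the 0.80 field forces manual review
print(to_xml("fax-001", "mortgage_closing", extracted))
```

Routing on the lowest field confidence is the most conservative choice; a real system might weight only the critical fields, such as the mortgage rate.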

Given that these processes accept mortgage/loan application forms and produce the loan documents and other related documentation, this entire business seems ripe for disruption, although given the glacial pace of technology adoption in Canadian financial services, this could be some time off. With the flexible handling of inbound documents that they’ve created, FNF Canada will be ready for it when it happens.

That’s it for me at the AIIM Toronto seminar; I had to duck out early and missed the two other short sponsor presentations by SystemWare and Lexmark/Kofax, as well as lunch and the closing keynote. Definitely worth catching up with some of the local people in the industry as well as hearing the customer case studies.

AIIM Toronto keynote with @jmancini77

I’m at the AIIM Toronto seminar today — I pretty much attend anything that is in my backyard and looks interesting — and John Mancini of AIIM is opening the day with a talk on business processes. Yes, Mr. Column 1 is talking about Column 2, if you get the Zachman reference. This is actually pretty significant: content management isn’t just about content, just as process management isn’t just about process, but both need to overlap and work together. I had a call with Mancini yesterday in advance of my keynote at ABBYY’s conference next month, and we spent 30 minutes talking about how disruption in capture technologies has changed all business processes. Today, in his keynote, he talked about disruptive business processes that have transformed many industries.

He gave us a look at people, process and technology against the rise (and sometimes fall) of different technology platforms: document management and workflow; enterprise content management; mobile and cloud. There are a lot of issues as we move from one type of platform to another: moving to a cloud SaaS offering, for example, drives the move from perimeter-based security to asset-based security. He showed a case study for financial processes within organizations (such as accounts payable and receivable) with both a tactical dimension of getting things done and a strategic side of building a bridge to digital transformation. Most businesses (especially traditional ones) operate at a slim profit margin, making it necessary to look at ways to reduce costs: not through small, incremental improvements, but through more transformational means. For financial processes, in many cases this means getting rid of manual data capture and manipulation: no more manual data entry, no more analysis via spreadsheets. And cost reduction isn’t the key driver behind transforming financial business processes any more: it’s the need for better business analytics. Done right, these analytics provide real-time insight into your business that provides a strategic competitive differentiator: the ability to predict and react to changing business conditions.

Mancini finished by allowing today’s sponsors, with booths around the room, to introduce themselves: Precision Content, AIIM, Box, Panasonic, SystemWare, ABBYY, SourceHOV, and Lexmark (Kofax). I’ll be here for the rest of the morning, and look forward to hearing from some of the sponsors and their customers here today.

Join the AIIM paper-free pledge

AIIM recently posted about the World Paper-Free Day on November 6th, and although I’m not sure that it’s recognized as a national holiday or anything, it’s certainly a good idea. I blogged almost three years ago about my mostly paperless office, and how to achieve such a thing yourself. Since that time, I’ve added an Epson DS-510 scanner, which has a nice small footprint and a sheet feeder; it sits right on my desk and there is never a backlog of scanning.

It’s not just about scanning and shredding, although those are pretty important activities: you have to have a proper retention plan that adheres to any regulatory requirements, and a secure offsite (cloud or otherwise) backup capability to ameliorate any physical site disasters.

You also need to consider how much backfile conversion that you’ll do: I decided to back-scan everything except my financial records at the time that I started going completely paperless, then scan everything including financials from that date forward. Each year, another batch of old paper financial records reached their destruction date and were shredded, the last of them just last year, and I no longer have any paper files. If back-scanning is too time-consuming for you but you want to start scanning everything day-forward, then store your old paper files by destruction date so that you can easily shred the batch of expired files each year until there are none left.
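
If you keep even a simple index of your paper files, the "store by destruction date" approach is easy to sketch. The retention period, file names and function names below are hypothetical; check the actual retention rules that apply to your own records.

```python
from collections import defaultdict

# ASSUMPTION: a flat six-year retention period, purely for illustration.
RETENTION_YEARS = 6


def group_by_destruction_year(files, retention_years=RETENTION_YEARS):
    """files is a list of (name, record_year) pairs.
    Returns a dict mapping destruction year to the files in that batch."""
    batches = defaultdict(list)
    for name, record_year in files:
        batches[record_year + retention_years].append(name)
    return dict(batches)


def due_for_shredding(batches, current_year):
    """Every file whose destruction year has been reached."""
    return sorted(name
                  for year, names in batches.items() if year <= current_year
                  for name in names)


paper_files = [("2008 tax return", 2008), ("2012 bank statements", 2012)]
batches = group_by_destruction_year(paper_files)
print(due_for_shredding(batches, 2015))  # prints ['2008 tax return']
```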

These things – scanning, document destruction, retention plan, secure backup, backfile conversion – are the same things that I’ve dealt with at large enterprise customers in the past on ECM projects, just on a small-office scale.

IBM ECM Strategy at Content2015

Wrapping up the one-day IBM Content 2015 mini-conference in Toronto (repeated in several other cities across North America) is Feri Clayton, director of document imaging and capture. Feri and I were two of the few female engineers at FileNet back during my brief time there in 2000-1, and I have many fond memories of our “women in engineering” lunch club of three members.

Clayton talked about how enterprises are balancing the three key imperatives of staying competitive through productivity and cost savings, increasing growth through customer centricity, and protecting the organization through security and compliance. With ECM initiatives, this boils down to providing the right information to employees and customers to allow them to make the right decisions at the right time. From an ECM capabilities standpoint, this requires the critical capabilities of content capture, content protection, activating content by putting it into business processes, analyzing content to reveal insights, and engaging people in content-centric processes and collaboration. Some recent advances for IBM: they have been moving towards a single unified UI for all of their ECM portfolio, and IBM Content Navigator now provides a common modern user experience across all products; they have also been recognized as a market leader in Case Management by the big analysts.

She did a pretty complete review of the entire ECM portfolio, including recent major releases as well as what’s coming up.

Looking forward, they’re continuing to improve Navigator Cloud (hosted ECM), advancing mobile capture and other document capture in Datacap, releasing managed cloud (IBM hosted) offerings for CMOD and Case Manager, and releasing a new Information Lifecycle Governance solution. They’re also changing their release cadence, moving to quarterly releases rather than the usual 1-2 years between releases, while making the upgrades much easier so that they don’t require a lot of regression testing.

IBM Navigator Cloud — the cloud ECM product, not the unified UI — has a new mobile UI and a simplified web UI that includes external file sharing; soon it will have a Mac sync client, and an ECM solution platform on the cloud codenamed “Galaxy” that provides for much faster development using solution patterns. There’s quite an extensive ECM mobile roadmap, with Case Manager and Datacap coming soon on mobile. The core content platform continues to be enhanced, but they’re also expanding to integrate with web-based editors such as Office 365 and Google Docs, and enhancing collaboration for external participants.

Case Manager, which is my key product of interest here today, will soon see a mobile interface (or app?), enhanced case analytics, enhanced property layout editor, simplified solution deployment and packaging, and more industry and vertical solutions. Further out, they’re looking at hybrid use cases with cloud offerings.

Good summary of the IBM ECM roadmap, and a wrap for the day.

IBM ECM and Cloud

I’m at the IBM Content 2015 road show mini-conference in Toronto today, and sat in on a session with Mike Winter (who I know from my long-ago days at FileNet prior to its acquisition by IBM) discussing ECM in the cloud.

The content at the conference so far has been really lightweight: I think that IBM sees this more as a pre-sales prospecting presentation than an actual informational conference for existing customers. Although there is definitely a place for the former, it should not necessarily be mixed with the latter; it just frustrates knowledgeable customers who were really looking for more product detail and maybe some customer presentations.

ECM in the cloud has a lot of advantages, such as being able to access content on mobile devices and share with external parties, but also has a lot of challenges in terms of security — or, at least, perceived security — when implementing in larger enterprise environments. IBM ECM has a very robust and granular security/auditing model that was already in place for on-premise capabilities; they’re working to bring that same level of security and auditing to hybrid and cloud implementations. They are using the CMIS content management standard as the API into their Navigator service for cloud implementation: their enhanced version of CMIS provides cloud access to their repositories. The typical use case is for a cloud application to access an ECM repository that is either on premise or in IBM’s SoftLayer managed hosting in a sync-and-share scenario; arguably, this is not self-provisioned ECM in the cloud as you would see from cloud ECM vendors such as Box, although they are getting closer to it with per-user subscription pricing. This is being rolled out under the Navigator brand, which is a bit confusing since Navigator is also the term used for the desktop UI. There was a good discussion on user authentication for hybrid scenarios: basically, IBM replicates the customers’ LDAP on a regular basis, and is moving to do the same via a SAML service in the future.

Winter gave us a quick demo of the cloud (hosted) Navigator running against a repository in Amsterdam: adding a document, adding tags (metadata) and comments, viewing via an HTML5 viewer that includes annotations, and more. Basically, a nice web-based UI on an IBM ECM repository, with most of the rich functionality exposed. It’s quick to create a shared teamspace and add documents for collaboration, and create simple review workflows. He’s a tech guy, so didn’t know the SLA or the pricing, although he did know that the pricing is tiered.

Activiti BPM Suite – Sweet!

There are definitely changes afoot in the open source BPM market, with both Alfresco’s Activiti and camunda releasing out-of-the-box end-user interfaces and model-driven development tools to augment their usual [Java] developer-friendly approach. In both cases, they are targeting “citizen developers”: people who have technical skills and do some amount of development, but in languages lighter weight than Java. There are a lot of people who fall into this category, including those (like me) who used to be hard-core developers but fell out of practice, and those who have little formal training in software development but have some other form of scientific or technical background.

Prior to this year, Activiti BPM was not available as a standalone commercial product from Alfresco, only bundled with Alfresco or as the community open source edition; as I discussed last year, their main push was to position Activiti as the human-centric workflow within their ECM platform. However, Activiti sports a solid BPMN engine that can be used for more than just document routing and lifecycle management, and in May Alfresco released a commercially-supported Alfresco Activiti product, although focused on the human-centric BPM market. This provides them with opportunities to monetize the existing Activiti community, as well as evolving the BPM platform independently of their ECM platform, such as providing cloud and hybrid services; however, it may have some impact on their partners who were relying on support revenue for the community version.

The open source community engine remains the core of the commercial product – in fact, the enterprise release of the engine lags behind the community release, as it should – but the commercial offering adds all of the UI tools for design, administration and end-user interface, plus cluster configuration for the execution engine.

The Activiti Administrator is an on-premise web application for managing clusters, deploying process models from local packages or the Activiti Editor, and technical monitoring and administration of in-flight processes. There’s a nice setup wizard for new clusters (the open source version requires manual configuration of each node), and nodes within the cluster can be auto-discovered and monitored. The monitoring of process instances allows drilling into processes to see variables, the in-flight process model, and more. Not a business monitoring tool, but seems like a solid technical monitoring tool for on-premise Activiti Enterprise servers.

The Activiti Editor is a web-based BPMN process modeling environment that is a reimplementation of other open-source tools, refactored with JavaScript libraries for better performance. The palette can be configured based on the user profile in order to restrict the environment, which would typically be used to limit the number of BPMN objects available for modeling in order to reduce complexity for certain business users to create simple models; a nice feature for companies that want to OEM this into a larger environment. Models can be shared for comments (in a history stream format), versioned, then accessed from the Eclipse plug-in to create more technical executable models. Although I saw this as a standalone web app back in April, it is now integrated as the Visual Editor portion of Kickstart within the Activiti Suite.

The Activiti Suite is a web application that brings together several applications into a single portal:

  • Kickstart is their citizen development environment, providing a simple step editor that generates BPMN 2.0 – which can then be refined further using the full BPMN Visual Editor or imported into the Eclipse-based Activiti Designer – plus a reusable forms library and the ability to bundles processes into a single process application for publishing within the Suite. In the SaaS version, it will integrate with cloud services including Google Drive, Alfresco, Salesforce, Dropbox and Box.
  • Tasks is the end-user interface for starting, tracking and participating in processes. It provides an inbox and other task lists, and provides for task collaboration by allowing a task recipient to add others who can then view and comment on the task. Written in Angular JS.
  • Profile Management, for user profiles and administration.
  • Analytics, for process statistics and reports.

The Suite is not fully responsive and doesn’t have a mobile version, although apparently there are mobile solutions on the way. Since BP3 is an Activiti partner, some of the Brazos tooling is available already, and I suspect that more mobile support may be on the way from BP3 or Alfresco directly.

They have also partnered with Fluxicon to integrate process mining, allowing for introspection of the Activiti BPM history logs; I think that this is still a bit ahead of the market for most process analysts but will make it easy when they are ready to start doing process discovery for bottlenecks and outliers.

I played around with the cloud version, and it was pretty easy to use (I even found a few bugs :) ) and it would be usable by someone with some process modeling and lightweight development skills to build apps. The Step Editor provides a non-BPMN flowcharting style that includes a limited number of functions, but certainly enough to build functional human-centric apps: implicit process instance data definition via graphical forms design; step types for human, email, “choice” (gateway), sub-process and publishing to Alfresco Cloud; a large variety of form field types; and timeouts on human tasks (although timers based on business days, rather than calendar days, are not there yet). The BPMN Editor has a pretty complete palette of BPMN objects if you want to do a more technical model that includes service tasks and a large variety of events.

Although initially launched in a public cloud version, everything is also available on premise as of the end of November. They have pricing for departmental (single-server up to four cores with a limit on active processes) and enterprise (eight cores over any number of servers, with additional core licensing available) configurations, and subscription licensing for the on-premise versions of Kickstart and Administrator. The cloud version is all subscription pricing. It seems that the target is really for hybrid BPM usage, with processes living on premise or in the cloud depending on the access and security requirements. Also, with the focus on integration with content and human-centric processes, they are well-positioned to make a play in the content-centric case management space.

Instead of just being an accelerator for adding process management to Java development projects, we’re now seeing open source BPM tools like Activiti being positioned as accelerators for lighter-weight development of situational applications. This is going to open up an entire new market for them: an opportunity, but also some serious new competition.

Spotfire Content Analytics At TIBCONOW

(This session was from late yesterday afternoon, but I didn’t remember to post until this morning. Oops.)

Update: the speakers were Thomas Blomberg from TIBCO and Rik Tamm-Daniels from Attivio. Thanks, guys!

I went to the last breakout on Monday to look at the new Spotfire Content Analytics, which combines Spotfire in-memory analytics and visualization with Attivio content analysis and extraction. This is something that the ECM vendors (e.g., IBM FileNet) have been offering for a while, and I was interested to see the Spotfire take on it.

Basically, content analytics is about analyzing documents, emails, blogs, press releases, website content and other human-created textual data (also known as unstructured content) in order to find insights; these days, a primary use case is to determine sentiment in social media and other public data, in order for a company to get ahead of any potential PR disasters.

Spotfire Content Analytics (or rather, the Attivio engine that powers the extraction) uses four techniques to find relevant information in unstructured content:

  • Text extraction, including metadata
  • Key phrase analysis, using linguistics to find “interesting” phrases
  • Entity extraction, identifying people, companies, places, products, etc.
  • Sentiment analysis, to determine degree of negative/positive sentiment and confidence in that score

Once the piece of content has been analyzed to extract this relevant information, more traditional analytics can be applied to detect patterns, tie these back to revenue, and allow for handling of potential high-value or high-risk situations.
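As a rough illustration of that second stage, here is a minimal sketch of rolling up per-document sentiment into an aggregate per country, weighting each score by the confidence attached to it. The record layout, the score ranges and the `aggregate_sentiment` function are all my own assumptions for illustration; this is not the Attivio or Spotfire API, and the data is made up.

```python
from collections import defaultdict

# Hypothetical extraction output: (country, sentiment in [-1, 1], confidence in [0, 1]).
# The layout and values are illustrative only, not Attivio's actual schema.
records = [
    ("Canada", 0.6, 0.9),
    ("Canada", 0.2, 0.5),
    ("Freedonia", -0.8, 0.8),
    ("Freedonia", -0.4, 0.2),
]


def aggregate_sentiment(rows):
    """Return the confidence-weighted mean sentiment per country."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for country, sentiment, confidence in rows:
        weighted_sum[country] += sentiment * confidence
        weight_total[country] += confidence
    # Skip countries whose total confidence is zero to avoid dividing by zero.
    return {
        country: weighted_sum[country] / weight_total[country]
        for country in weighted_sum
        if weight_total[country] > 0
    }


by_country = aggregate_sentiment(records)
```

Weighting by confidence means a low-confidence score pulls the aggregate less than a high-confidence one; a plain average would treat them equally.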

Spotfire Content Analytics (via the Attivio engine) uses machine learning that allows you to train the system using sample data, since the information that is considered relevant is highly dependent on the specific content type (e.g., a tweet versus a product review). They provide rich text analytics, seamless visualization via Spotfire, agility through combining sources and transformations, and support for diverse content sources. They showed a demo based on a news feed by country from the CIA factbook site (I think), analyzing and showing aggregate sentiment about countries: as you can imagine, countries experiencing war and plague right now aren’t viewed very positively. Visualization using Spotfire allows for some nice geographic map-based searching, as well as text searching. The product will be available later this month (November 2014).

Great visualizations, as you would expect from Spotfire; it will be interesting to see how this measures up to IBM’s and other content analytics offerings once it’s released.

AIIM Information Chaos Rescue Mission – Toronto Edition

AIIM is holding a series of ECM-related seminars across North America, and since today’s is practically in my back yard, I decided to check it out. It’s a free seminar so heavily sponsored; most of the talks are from the sponsor vendors or conversations with them, but John Mancini kicked things off and moderated mini-panels with the sponsor speakers to tease out some of the common threads.

The morning started with John Mancini talking about disruptive consumer technologies (cloud, mobile, IoT) and how these are “breaking” our internal business processes by fragmenting the channels and information sources. The result is information chaos, where information about a client lives in multiple places and often can’t be properly aggregated and contextualized, while still remaining secure. Our legacy systems, designed to be secure, were put in place before the devices that are causing security leaks were even invented; those older systems can’t even envision all the ways that information can leak out of an organization. Furthermore, the more consumer technologies advance, the further behind our IT people seem, making it more likely that business users will just go outside/around IT for what they need. New technologies need to be put in the context of our information management practices, and those practices adjusted to include the disruptors, rather than just ignore them: consider how to minimize risk in this information chaos state; how to use information to engage and collaborate, rather than just shutting it away in a vault; how to automate processes that involve information that may not be stored in an ECM; and how to extract insights from this information.

A speaker from Fujitsu was up next, stating some interesting statistics on just how big the information chaos problem is:

  • 50% of business documents are still on paper; most businesses have many of their processes still reliant on paper.
  • Departmental CM systems have proliferated: 75% of organizations with a CM system have more than one, and 25% have more than four. SharePoint is like a virus among them, with an estimated 50% of organizations worldwide using SharePoint ostensibly for collaboration, but usually for ad hoc content management.
  • Legacy CM systems are themselves a hidden source of costs, inefficiency and risk.

In other words, we have a lot of problems to tackle still: large organizations tend to have a lot of non-integrated content management systems; smaller organizations tend to have none at all.

We finished the first morning segment with introductions from the event sponsors, who had small booths around the room.

An obvious omission (to me, anyway) was IBM/FileNet — not sure why they are not here as a sponsor considering that they have a sizable local contingent.

The rest of the morning was taken up with two sets of short vendor presentations, each followed by a Q&A session moderated by John Mancini: first Epson, K2 and EMC; then KnowledgeLake, HP Autonomy, Kodak Alaris and OpenText. There were audience questions about information security and risk, collaboration/case management, ECM benefits and change management, auto-classification, SharePoint proliferation, cloud storage, managing content retention and disposal, and many other topics; lots of good discussions from the panelists. I was amazed (or maybe just sadly accepting) at the number of questions dealing with paper capture and disposal; I’ve been working in scanning/workflow/ECM/BPM since the late ’80s, and apparently there are still a lot of people and processes resistant to getting rid of paper. As a small business owner, I run a paperless office, and have spent a big chunk of my career helping much larger enterprises go paperless as part of streamlining their processes, so I know that this is not only possible, but has a lot of benefits. As one of the vendors pointed out, just do something, rather than sitting frozen, surrounded by ever-increasing piles of paper.

I skipped out at lunchtime and missed the closing keynote since it was the only bit remaining after the lunch break, although it looked like a lot of the customer attendees stayed around for the closing and the prize draws afterwards, and to spend time chatting with the vendors.

Thanks to AIIM and the sponsors for today’s seminar; the presentations were a bit too sales-y for me, but there were some good nuggets of information. There’s still one remaining in Chicago and one in Minneapolis coming up next week if you want to sign up.