Discovering process models from unlabelled event logs #BPM2009

Diogo Ferreira of Universidade Técnica de Lisboa presented a paper on process discovery based on unlabelled event logs: logs where each event is identified only by the task, not by the process instance that generated it. Consider that a process instance may be executed via multiple paths through the process model, resulting in a different sequence of events being logged: although you might know all possible paths through the model, you don’t know which one any given instance followed. Also consider that multiple instances will be executing simultaneously, so their events are interleaved in the log.

Taking a probabilistic approach, you can take the event sequence and the source sequence (i.e., you know both the events and the instances that created them) and generate a matrix of probabilities that any given event will follow another within the same source instance: that’s some fairly standard math. He then took the event sequence and the matrix (which now represents a priori knowledge about how events interrelate), and did a fairly magic-looking calculation that estimated the source sequence from that information.

The problem, of course, is that you don’t have the magic matrix, you only have the event sequence: initialize the matrix to something, then use the event sequence and the matrix to estimate the source sequence, then use the event sequence and the estimated source sequence to estimate the matrix. Wash, rinse, repeat until the matrix converges. You could initialize the matrix randomly, but that would take a while to converge (or would converge to a local maximum); instead, Ferreira pulled a rabbit out of his hat by stating that the matrix can be initialized with the transition probabilities present in the event sequence, that is, as if the event sequence were generated from a single source. As the paper states:

Even if x [event sequence] is the result of interleaving a number of sources, their underlying behaviour will be present in M+ [probability matrix] since consistent behaviour will stand out with stronger transition probabilities than the spurious effects of random interleaving. Therefore, M+ is a good initial guess for the estimation of M.
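To make the iteration a bit more concrete, here’s a very rough sketch of the general idea in Python. This is my own simplification for illustration, not the algorithm from the paper: it initializes the matrix from the interleaved sequence as if it came from a single source, then alternates between greedily assigning each event to a source and re-estimating the matrix from that assignment.

```python
import numpy as np
from collections import defaultdict

START = "__start__"   # virtual symbol marking the beginning of a source

def transition_matrix(sequences, symbols, smoothing=1.0):
    """First-order transition probabilities over symbols (including START),
    estimated from a set of sequences with additive smoothing."""
    idx = {s: i for i, s in enumerate(symbols)}
    counts = np.full((len(symbols), len(symbols)), smoothing)
    for seq in sequences:
        for a, b in zip([START] + list(seq), seq):
            counts[idx[a], idx[b]] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def estimate_sources(events, tasks, n_iter=10):
    """Rough sketch only, not the paper's algorithm: initialize M as if the
    interleaved event sequence came from a single source (the M+ guess quoted
    above), then alternate between greedily assigning each event to a source
    and re-estimating M from the resulting per-source subsequences."""
    symbols = [START] + list(tasks)
    idx = {s: i for i, s in enumerate(symbols)}
    M = transition_matrix([events], symbols)       # initial guess, M+
    assignment = []
    for _ in range(n_iter):
        last = []          # last event seen in each open source
        assignment = []
        for e in events:
            # continue the source that makes this transition most probable,
            # or open a new source if starting fresh is more probable
            scores = [M[idx[l], idx[e]] for l in last] + [M[idx[START], idx[e]]]
            best = int(np.argmax(scores))
            if best == len(last):                  # new source
                last.append(e)
            else:
                last[best] = e
            assignment.append(best)
        # re-estimate M from the sequences implied by the assignment
        per_source = defaultdict(list)
        for src, e in zip(assignment, events):
            per_source[src].append(e)
        M = transition_matrix(per_source.values(), symbols)
    return assignment, M

# toy interleaved log from two hypothetical A -> B -> C instances
events = ["A", "A", "B", "C", "B", "C"]
assignment, M = estimate_sources(events, ["A", "B", "C"])
print(assignment)   # one source label per event; accuracy improves with longer logs
```

A real implementation would also model how sources terminate and do a proper expectation step; this is just to show the shape of the iteration.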

Amazingly, this works. Or maybe it’s not so amazing, since I suppose there would be no research paper if it didn’t. As the number of sources (instances) increases, the accuracy approaches that of the case where both the event and source sequences are known; as the number of overlapping sources increases, the accuracy drops to about 60% (by the time you reach 10 overlapping instances), then flattens out.

There are a number of use cases for this: preprocessing for other process mining algorithms, or as a labeling mechanism to find the source instance when it is unknown. Or just if you want to show off some cool math at parties.

Marge Breya on BusinessObjects Explorer #sapphire09

A small group of bloggers had the opportunity to sit around a table with Marge Breya to expand on what we saw during the press conference on BusinessObjects Explorer. She discussed how unstructured data is being elevated to first-class status within SAP, with analytics and reporting tools that can layer over unstructured as well as structured data. Part of this involves parsing structure out of unstructured data through an appropriate semantic layer.

They’re also playing with things (that she couldn’t really talk about, although some customers have access) that provide much more of a hosted Web 2.0-type experience. They’re working on Explorer On Demand, which allows you to upload spreadsheets and other file-oriented data, then do some analysis and visualization on your own data to get an idea of how valuable tools like this are. They handed out some test drive passes for this, so I may get a chance to play around with it some time soon. I expect that many organizations won’t want their data warehouse in the cloud, but this will at least give them a chance to try it out in a no-risk environment. They’re doing the same with more of the BusinessObjects platform: offer a free version with some starter functionality, then hope it goes viral in terms of stepping up to the paid on-demand or on-premise versions. That’s a pretty powerful model in the consumer space, although traditional enterprises may have a more difficult time adopting technology in this manner. Considering that the higher end of Explorer is targeted at large organizations, this could be the biggest challenge.

Breya had some interesting background on product strategy as well, especially around how SAP had traditionally been doing OLAP-based business intelligence, and BusinessObjects didn’t have much in the way of OLAP, so the acquisition produced a minimum of overlap. Polestar, on the market for a couple of years as an ad hoc query tool, was retooled into Explorer, which handles a million or so rows of data, and Explorer Accelerated, a software and hardware bundle that can handle billions of rows.

She went on to talk about the ties between BI and BPM, and although she couldn’t talk about anything specific, there are some interesting things coming in terms of operational BI, monitoring and characterizing processes for the purposes of process improvement, as well as invoking analytics within processes for decision support.

In response to a question about the consumerization of SAP products, she promised us “an experience that will take decisioning to the next level, involving collaboration” in something that is just entering private beta now. I’m picturing a cross between Xbox Live and Vanilla Sky, which would be cool, but I still think that there are challenges to adoption of completely new user experience paradigms. Since SAP has a wide customer base in manufacturing and other industries with low margins and the requirement for constant product innovation, this may not be as much of a challenge as it would be in verticals such as financial services and insurance.

We had a discussion about the cloud versus on premise as the location for data, with the underlying theme that it’s not an all or nothing proposition: while operational data may be behind the firewall, it makes much more sense to leave third-party benchmarking data in the cloud where it can be shared and frequently updated. The new generation of BI products from any vendor can’t be restrictive in their data sources, but have to be able to aggregate information from a variety of sources both inside and outside the firewall.

Innovation World: ChoicePoint external customer solutions with BPM, BAM and ESB

I took some time out from sessions this afternoon to meet with Software AG’s deputy CTOs, Bjoern Brauel and Miko Matsumura, but I’m back for the last session of the day with Cory Kirspel, VP of identity risk management at ChoicePoint (a LexisNexis company), on how they have created externally-facing solutions using BPM, BAM and ESB. ChoicePoint screens and authenticates people for employment screening, insurance services and other identity-related purposes, plus does court document retrieval. There’s a fine line to walk here: companies need to protect the privacy of individuals while minimizing identity fraud.

Even though they only really do two things — credential and investigate people and businesses — they had 43+ separate applications on 12 platforms with various technologies in order to do this. Not only did that make it hard to do what they needed internally, but customers also wanted to integrate ChoicePoint’s systems directly into their own, with an implementation time of only 3-4 months, and to have visibility into the processes.

They were already a Software AG customer with the legacy modernization products, so they took a look at Software AG’s BPM, BAM and ESB. The result is that they had better visibility, and could leverage the tools to build solutions much faster since they weren’t building everything from the ground up. He walked us through some of the application screens that they developed for use in their customers’ call centers: allow a CSR to enter some data about a caller, select a matching identity by address, verify the identity (e.g., does the SSN match the name), authenticate the caller with questions that only they could answer, then provide a pass/fail result. The overall flow and the parameters of every screen can be controlled by the customer organization, and the whole flow is driven by a process model in the BPMS, which allows them to assign and track KPIs on each step in the process.
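Purely to illustrate the shape of that flow (the names, data and checks here are hypothetical, not ChoicePoint’s actual services), each screen maps to a step that either advances the caller or returns a fail:

```python
from dataclasses import dataclass

# Toy identity store and question bank; in reality these would be calls to
# ChoicePoint's backend services, not hard-coded data.
IDENTITIES = {
    "123 Main St": {"name": "Pat Smith", "ssn": "123-45-6789",
                    "question": "first car", "answer": "corolla"},
}

@dataclass
class CallerInput:
    name: str
    address: str
    ssn: str
    answer: str   # response to the out-of-wallet question

def credential_caller(caller: CallerInput) -> str:
    """Hypothetical pass/fail flow mirroring the screens described above:
    match on address, verify the SSN against the name, then authenticate."""
    identity = IDENTITIES.get(caller.address)
    if identity is None or identity["name"] != caller.name:
        return "fail"                       # no matching identity
    if identity["ssn"] != caller.ssn:
        return "fail"                       # verification failed
    if identity["answer"] != caller.answer.lower():
        return "fail"                       # authentication failed
    return "pass"

print(credential_caller(CallerInput("Pat Smith", "123 Main St",
                                    "123-45-6789", "Corolla")))
```

In the real solution, each of those steps is a node in the BPMS process model, which is what lets them attach KPIs to every step.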

They’re also moving their own executives from the old way of keeping an eye on business — looking at historical reports — to the new way with near real-time dashboards. As well as having visibility into transaction volumes, they are also able to detect unusual situations that might indicate fraud or other situations of increased risk, and alert their customers. They found that BAM and BI were misunderstood, poorly managed and under-leveraged; these technologies could be used on legacy systems to start getting benefits even before BPM was added into the mix.

All of this allowed them to reduce the cost of ownership, which protects them in a business that competes on price, as well as offering a level of innovation and integration with their customers’ systems that their competitors are unable to achieve.

They used Software AG’s professional services, and paired each external person with an internal one in order to achieve knowledge transfer.

Business Rules Forum: James Taylor and Neil Raden keynote

Opening the second conference day, James Taylor and Neil Raden gave a keynote about competing on decisions. First up was James, who started with a definition of what a decision is (and isn’t), speaking particularly about operational decisions that we often see in the context of automated business processes. He made a good point that your customers react to your business decisions as if they were deliberate and personal to them, when often they’re not; James’ premise is that you should be making these deliberate and personal, providing the level of micro-targeting that’s appropriate to your business (without getting too creepy about it), but that there’s a mismatch between what customers want and what most organizations provide.

Decisions have to be built into the processes and systems that manage your business, so although business may drive change, IT gets to manage it. James used the term “orthogonal” when talking about the crossover between process and rules; I used this same expression in a discussion with him yesterday about how processes and decisions should not be dependent upon each other: if a decision and a process are interdependent, then you’re likely dealing with a process decision that should be embedded within the process, rather than a business decision.

A decision-centric organization is focused on the effectiveness of its decisions rather than aggregated, after-the-fact metrics; decision-making is seen as a specific competency, and resources are dedicated to making those decisions better.

Enterprise decision management, as James and Neil now define it, is an approach for managing and improving the decisions that drive your business:

  • Making the decisions explicit
  • Tracking the effectiveness of the decisions in order to improve them
  • Learning from the past to increase the precision of the decisions
  • Defining and managing these decisions for consistency
  • Ensuring that they can be changed as needed for maximum agility
  • Knowing how fast the decisions must be made in order to match the speed of the business context
  • Minimizing the cost of decisions

Using an airline pilot analogy, he discussed how business executives need a number of decision-related tools to do their job effectively:

  • Simulators (what-if analysis), to learn what impact an action might have
  • Auto-pilot, so that their business can (sometimes) work effectively without them
  • Heads-up display, so they can see what’s happening now, what’s coming up, and the available options
  • Controls, simple to use but able to control complex outcomes
  • Time, to be able to take a more strategic look at their business

Continuing on the pilot analogy, he pointed out that the term dashboard is used in business to really mean an instrument cluster: display, but no control. A true dashboard must include not just a display of what’s happening, but controls that can impact what’s happening in the business. I saw a great example of that last week at the Ultimus conference: their dashboard includes a type of interactive dial that can be used to temporarily change thresholds that control the process.
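As a toy illustration of that difference (the class and method names are my own invention, not any vendor’s API), a control on a true dashboard writes back to the process rather than just reading from it, for example by pushing a temporary threshold override:

```python
from datetime import datetime, timedelta

class ApprovalThreshold:
    """Toy sketch of 'display plus control': the dashboard can read the
    current value and also push a temporary override, e.g. during a
    month-end volume spike."""
    def __init__(self, default=10_000):
        self.default = default
        self.override = None
        self.expires = None

    def current(self):
        # the value the running process actually uses right now
        if self.override is not None and datetime.now() < self.expires:
            return self.override
        return self.default

    def set_temporary(self, value, hours=24):
        # what the interactive dial on the dashboard would invoke
        self.override = value
        self.expires = datetime.now() + timedelta(hours=hours)

threshold = ApprovalThreshold()
threshold.set_temporary(25_000, hours=48)   # dial turned up for month-end volume
print(threshold.current())
```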

James turned the floor over to Neil, who dug further into the agility imperative: rethinking BI for processes. He sees today’s BI tools as insufficient for monitoring and analyzing business processes, because of the agile and interconnected nature of these processes. This comes through in the results of a survey that they did about how often people use the related tools: the average hours per week that a marketing analyst spent using their BI tool was 1.2, versus 17.4 for Excel, 4.2 for Access and 6.2 for other data administration tools. I see Excel everywhere in most businesses, whereas BI tools are typically only used by specialists, so this result does not come as a big surprise.

The analytical needs of processes are inherently complex, requiring an understanding of the resources involved and process instance data, as well as the actual process flow. Processes are complex causal systems: much more than just that simple BPMN diagram that you see. A business process may span multiple automated (monitored) processes, and may be created or modified frequently. Stakeholders require different views of those processes; simple tactical needs can be served by BAM-type dashboards, but strategic needs — particularly predictive analysis — are not well-served by this technology. This is beyond BI: it’s process intelligence, where there must be understanding of other factors affecting a process, not just measuring the aggregated outcomes. He sees process intelligence as a distinct product type, not the same as BI; unfortunately, the market is being served (or not really served) by traditional query-based approaches against a relatively static data model, or what Neil refers to as a “tortured OLAP cube-based approach”.

What process intelligence really needs is the ability to analyze the timing of the traffic flow within a process model in order to provide more accurate flow predictions, while allowing for more agile process views that are generated automatically from the BPMN process models. The analytics of process intelligence are based on the process logs, not pre-determined KPIs.
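As a rough sketch of what that log-driven analysis looks like (the toy data and column names are mine, not from any particular product), the timing of each edge in the flow can be derived directly from the process event log rather than from pre-defined KPIs:

```python
import pandas as pd

# Hypothetical event log: one row per completed activity in a process instance.
log = pd.DataFrame({
    "case_id":   [1, 1, 1, 2, 2, 2],
    "activity":  ["receive", "review", "approve"] * 2,
    "timestamp": pd.to_datetime([
        "2008-10-01 09:00", "2008-10-01 10:30", "2008-10-02 08:00",
        "2008-10-01 11:00", "2008-10-01 16:00", "2008-10-03 09:00"]),
}).sort_values(["case_id", "timestamp"])

# Pair each activity with the one that follows it in the same case, and
# measure the elapsed time on that edge of the flow.
log["next_activity"] = log.groupby("case_id")["activity"].shift(-1)
log["elapsed"] = log.groupby("case_id")["timestamp"].shift(-1) - log["timestamp"]

edge_timing = (log.dropna(subset=["next_activity"])
                  .groupby(["activity", "next_activity"])["elapsed"].mean())
print(edge_timing)
```

The same per-edge statistics could then feed flow predictions or an automatically generated process view, without anyone having defined a KPI up front.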

Neil ended up by tying this back to decisions: basically, you can’t make good decisions if you don’t understand how your processes work in the first place.

Interesting that James and Neil deal with two very important aspects of business processes: James covers decisions, and Neil covers analytics. I’ve done presentations in the past on the crossover between BPM, BRM and BI, but they’ve dug into these concepts in much more detail. If you haven’t read their book, Smart Enough Systems, there’s a lot of great material in there on this same theme; if you’re here at the forum, you can pick up a copy at their table in the expo this afternoon.

Ultimus: Process optimization

Chris Adams is back to talk to us about process optimization, both as a concept and in the context of the Ultimus tools available to assist with this. I’m a bit surprised by the tone and content of this presentation, in which Chris is explaining why you need to optimize processes; I would have thought that anyone who has bought a BPMS probably gets the need for process optimization.

The strategies that they support:

  • Classic: updating your process and republishing it without changing work in progress
  • Iterative: focused and more specific changes updating live process instances
  • Situational/temporary: managers changing the runtime logic (really, the thresholds applied using rules) in live processes, such as changing an approval threshold during a month-end volume increase
  • Round-trip optimization: comparing live data against modeling result sets in simulation

There are a number of tools for optimizing and updating processes:

  • Ultimus Director, allowing a business manager to change the rules in active processes
  • Studio Client, the main process design environment, which allows for versioning each artifact of a process; it also allows changes to be published back to update work in progress
  • iBAM, providing visibility into work in progress; it’s a generic dashboarding tool that can also be used for visualization of other data sets, not just Ultimus BPM instance data

He finished up with some best practices:

  • Make small optimizations to the process and update often, particularly because Ultimus allows for the easy upgrade of existing process instances
  • Use Ultimus Director to get notifications of
  • Use Ultimus iBAM interactive dials to allow executives to make temporary changes to rule thresholds that impact process flow

There was a great question from the audience about the use of engineering systems methodology in process optimization, such as the theory of constraints; I don’t think that most of the vendors are addressing this explicitly, although the ideas are creeping into some of the more sophisticated simulation products.

Ultimus: Reports and Dashboards

Chris Adams is probably now thinking that I’m stalking him: not only do I attend his first two technical sessions, but when he switches to the business track for this presentation, I follow him. However, I wanted to hear about their reporting and analytics capabilities, and he covered off reporting, dashboards, BAM, alerts and using third-party analytics.


He started out with the underlying premise that you need to have governance over your business data, or your processes won’t be effective and efficient; in order to do that, you need to identify the key performance indicators (KPIs) that will be used to measure the health of your processes. This means both real-time monitoring and historical analytics.

Ultimus iBAM provides a real-time dashboard that works with both V7 and V8. In V8 only, there are also email alerts when specific KPI thresholds are reached.

For offline reporting, they have three types:

  • Process reports, automatically created for process instance analytics
  • User reports, also automatically created for workload and user productivity
  • Custom reports that allow the historical data to be filtered by other business data

Reports can be viewed as charts as well as tabular reports; there is a third-party report generation tool invisibly built in (Infologistics?); Chris noted that this is the only third-party OEM component in Ultimus.

If you’re using Crystal Reports or Cognos, Ultimus has now opened up and created connectors to allow for reporting on the Ultimus history data directly from those platforms; by the end of the year, they’ll add support for SQL Server Reporting Services as well.

There will be a more technical session on the reporting and analytics later today.

Business Objects Summit closing Q&A

Jonathan Becher hosted a wrap-up Q&A with Doug Merritt, Marge Breya and Sanjay Poonen. I’ve consolidated the responses rather than attributing them to the individuals:

  • On reasons for Business Objects’ continued growth: major contributors include having the SAP sales force also selling Business Objects products, and expansion of the product suite to include GRC and EPM. Also, synergy of two leaders in different markets coming together to create something bigger than the sum of the parts.
  • On portfolio roadmap for products being sunsetted or merged (a.k.a. the stuff that I wasn’t allowed to blog about earlier): it’s probably accurate to summarize that some of the SAP BI products will be discontinued but the customers will be migrated to appropriate Business Objects products, and there will be a few products that are merged.
  • On the growth of on-demand BI: expect to see some of the Business Objects applications (as opposed to just the platforms) offered using a SaaS model, although there’s nothing definite being discussed here.
  • On the link between BI and business rules, which hasn’t really been mentioned explicitly today: operational BI is part of their portfolio, and they’re working on ways to integrate more closely with BPM, BAM and decisioning.
  • On open source: they’re not seeing pressure from open source products, so they’re working on making their current successful OEM strategy work for them rather than considering releasing open source products.

After the panel, Becher did a summary about closing the gap between strategy and execution, and the trends that are driving innovation in business intelligence:

  • Unified information, moving from structured information generated within the four walls of the organization, to structured and unstructured and internal and external information
  • Collaborative decisions, moving from individual contributors within functional silos, to teams collaborating and communicating across boundaries
  • Business network optimization, from point relationships with customers and suppliers, to a dynamic network of partners

Business Objects’ goal: to transform the way the world works by connecting people, information and businesses. A bit ambitious, but they believe that bringing together BI, EPM and GRC is truly transformational.

That’s it for the Business Objects Influencer Summit; I’m staying on here tomorrow for the SAP SME day and will continue blogging then.

Fireside chat with Doug Merritt at Business Objects Summit

Keeping with SAP’s excellent blogger relations, a few of us bloggers had a chance for a quick chat with Doug Merritt about acquisitions in the space, the “walking dead” of BI vendors, popular BI applications, SaaS BI, go-to-market strategies, the transition to being part of SAP, new product segments, selling into big accounts versus mid-market, the challenges of distribution, and recent maintenance fee increases. Interesting stuff.

This is my first Business Objects event, and I’m still getting used to hearing the insiders refer to it as “bob-j” (presumably from the pre-acquisition ticker symbol).

Business Objects Summit: Franz Aman on BI Platform

Franz Aman, VP of Product Marketing, gave us a product roadmap of the BI platform within Business Objects and SAP. Unfortunately, he declared the session as being under NDA, even though a lot of what he talked about had nothing to do with future product directions, so I can’t share it with you.

The true innovation, which I hope that I’m not breaking NDA to report on, is the use of a background gradient that goes from SAP yellow to Business Objects blue in the boxes that represent products jointly developed by SAP and Business Objects:

Secret SAP-Business Objects background

Shhhh…you didn’t see it here.