Deciding on process modeling tools #GartnerBPM

Bill Rosser presented a decision framework for identifying when to use BPA (business process analysis), EA (enterprise architecture) and BPM modeling tools for modeling processes: all of them can model processes, but which should be used when?

It’s first necessary to understand why you’re modeling your processes, and the requirements for the model: these could be related to quality, project validation, process implementation, as part of a larger enterprise architecture modeling effort, and many other reasons. In the land of BPM, we tend to focus on modeling for process implementation because of the heavy focus on model-driven development in BPMS, hence model within our BPMS, but many organizations have other process modeling needs that are not directly related to execution in a BPMS. Much of this goes back to EA modeling, where several levels of process modeling occur in order to fulfill a number of different requirements: they’re all typically in one column of the EA framework (column 2 in Zachman, hence the name of this blog), but stretch across multiple rows of the framework, such as conceptual, logical and implementation.

Different types and levels of process models are used for different purposes, and different tools may be used to create those models. He showed a very high-level business anchor model that shows business context, a conceptual process topology model, a logical process model showing tasks within swimlanes, and a process implementation model that looked very similar to the conceptual model but included more implementation details.

As I’ve said before, introspection breeds change, and Rosser pointed out that the act of process modeling reaps large benefits in process improvement since the process managers and participants can now see and understand the entire process (probably for the first time), and identify problem areas. This premise is what’s behind many process modeling initiatives within organizations: they don’t plan to build executable processes in a BPMS, but model their processes in order to understand and improve the manual processes.

Process modeling tools can come in a number of different guises: BPA tools, which are about process analysis; EA tools, which are about processes in the larger architectural context; BPM tools, which are about process execution; and process discovery tools, which are about process mining. They all model processes, but they provide very different functionality around that process model, and are used for different purposes. The key problem is that there’s a lot of overlap between BPA, EA and BPM process modeling tools, making it more difficult to pick the right kind of tool for the job. EA tools often have the widest scope of modeling and analysis capabilities, but don’t do execution and tend to be more complex to use.

He finished by matching up process modeling tools with BPM maturity levels:

  • Level 1, acknowledging operational inefficiencies: simple process drawing tools, such as Visio
  • Level 2, process aware: BPA, EA and process discovery tools for consistent process analysis and definition of process measurement
  • Levels 3 and 4, process control and automation: BPMS and BAM/BI tools for execution, control, monitoring and analysis of processes
  • Levels 5 and 6, agile business structure: simulation and integrated value analysis tools for closed-loop connectivity of process outcomes to operational and strategic outcomes

He advocates using the simplest tools possible at first, creating some models and learning from the experience, then evaluating more advanced tools that cover more of the enterprise’s process modeling requirements. He also points out that you don’t have to wait until you’re at maturity level 3 to start using a BPMS; you just don’t have to use all the functionality up front.

Divide-and-Conquer Strategies for Process Mining #BPM2009

In the first of two papers in the final session of the conference, Josep Carmona of Universitat Politecnica de Catalunya presented on process mining calculation strategies. The theory of regions shows how to derive a Petri net representation of a process model from the process log, which shows the transition between states, but it’s very computationally expensive. This paper deals with ways of making that computation less expensive in order to deal effectively with large logs.

First is a decompositional strategy, which partitions the regions in a way that allows the identification of a set of state machines that cover all the events, then uses parallel composition to assemble the state machines into a Petri net.

The second approach is a higher-level divide-and-conquer strategy, where the event log is recursively partitioned by event class until the log sections are small enough to use other techniques. The clustering of the events is the key thing here: first, compute the causal dependency graph, then use spectral graph theory to find clusters of highly related events that will be partitioned off into their own section of the event log.
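As a toy illustration of that clustering idea (not the authors’ actual algorithm, and using an invented mini-log), here’s a spectral bipartition of the event classes based on a symmetrised directly-follows graph:

```python
import numpy as np

def spectral_bipartition(traces):
    """Split the event classes into two clusters of highly related events,
    using the Fiedler vector of the (symmetrised) causal dependency graph."""
    events = sorted({e for trace in traces for e in trace})
    idx = {e: i for i, e in enumerate(events)}
    n = len(events)

    # Directly-follows counts, symmetrised into an undirected adjacency matrix.
    A = np.zeros((n, n))
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            A[idx[a], idx[b]] += 1
    A = A + A.T

    # Unnormalised graph Laplacian: its second-smallest eigenvector (the
    # Fiedler vector) gives a relaxed minimum cut, so splitting on its sign
    # separates weakly connected groups of events.
    L = np.diag(A.sum(axis=1)) - A
    _, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    fiedler = vecs[:, 1]
    return ({e for e in events if fiedler[idx[e]] >= 0},
            {e for e in events if fiedler[idx[e]] < 0})

# Two tightly coupled pairs, (a,b) and (c,d), joined by one weak link:
part1, part2 = spectral_bipartition([list("ababcdcd")])
```

The cut lands on the weakest edge of the dependency graph, which is exactly the behavior needed to carve off a section of the event log for separate processing.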

What they’ve seen in experiments using this technique is that there is a significant computational improvement (from minutes to seconds) from the decompositional approach, and that the divide-and-conquer approach allows for the processing of event logs that are just too large for other techniques.

You can get Genet, the tool that they developed to do this, here.

Discovering process models from unlabelled event logs #BPM2009

Diogo Ferreira of Universidade Técnica de Lisboa presented a paper on process discovery based on unlabelled event logs: where the events in the log are only identified by the specific task, not by the process instance. Consider that a process instance may be executed via multiple paths through the process model, resulting in a different sequence of events logged: although you might know all possible paths through the model, you don’t know which one any given instance followed. Also consider that processes will be executing simultaneously, so their events are intermingled.

Taking a probabilistic approach, you can take the event sequence and the source sequence (i.e., you know both the events and the instances that created them) and generate a matrix of probabilities that any given event will follow another within the same source instance: that’s some fairly standard math. He then took the event sequence and the matrix (which now represents a priori knowledge about how events interrelate), and performed a fairly magic-looking calculation that derived the source sequence from that information.

The problem, of course, is that you don’t have the magic matrix, you only have the event sequence: initialize the matrix to something, then use the event sequence and the matrix to estimate the source sequence, then use the event sequence and the estimated source sequence to estimate the matrix. Wash, rinse, repeat until the matrix converges. You could initialize the matrix randomly, but that would take a while to converge (or would converge to a local maximum); instead, Ferreira pulled a rabbit out of his hat by stating that the matrix can be initialized with the transition probabilities present in the event sequence, that is, as if the event sequence were generated from a single source. As the paper states:

Even if x [event sequence] is the result of interleaving a number of sources, their underlying behaviour will be present in M+ [probability matrix] since consistent behaviour will stand out with stronger transition probabilities than the spurious effects of random interleaving. Therefore, M+ is a good initial guess for the estimation of M.

Amazingly, this works. Or maybe not so amazing, because I suppose there would be no research paper if this didn’t work. As the number of sources (instances) increases, the accuracy approaches that of when both the event and source sequence are known; as the number of overlapping sources increases, the accuracy drops to about 60% (by the time that you reach 10 overlapping instances), then flattens out.
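For the curious, the iterative scheme described above can be sketched in a few lines. This is a simplification, not the paper’s algorithm: the greedy assignment step and the fixed start prior for empty sources are my own invented stand-ins for the proper probabilistic estimation.

```python
from collections import defaultdict

def transition_matrix(sequences):
    """First-order transition probabilities estimated from a set of sequences."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {a: {b: c / sum(nxt.values()) for b, c in nxt.items()}
            for a, nxt in counts.items()}

def estimate_sources(events, M, n_sources, p_start=0.6):
    """E-step (greedy sketch): assign each event to the source whose last
    event gives it the highest transition probability under M; empty
    sources compete with a fixed start prior (an assumption)."""
    sources = [[] for _ in range(n_sources)]
    for e in events:
        scores = [M.get(s[-1], {}).get(e, 0.0) if s else p_start
                  for s in sources]
        sources[scores.index(max(scores))].append(e)
    return [s for s in sources if s]

def separate(events, n_sources, max_iter=20):
    # Initialise M as if the interleaved stream came from a single source
    # (the "good initial guess" quoted above), then alternate source
    # estimation and matrix re-estimation until the matrix converges.
    M = transition_matrix([events])
    for _ in range(max_iter):
        sources = estimate_sources(events, M, n_sources)
        M_new = transition_matrix(sources)
        if M_new == M:
            break
        M = M_new
    return sources, M
```

On a tiny interleaved log this converges in a few iterations, although this toy version won’t match the accuracy reported in the paper.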

There are a number of use cases for this: preprocessing for other process mining algorithms, or as a labeling mechanism to find the source instance when it is unknown. Or just if you want to show off some cool math at parties.

Tutorial: enabling flexibility in process-aware information systems #BPM2009

Manfred Reichert of Ulm University and Barbara Weber of University of Innsbruck presented a tutorial on the challenges, paradigms and technologies involved in enabling flexibility in process-aware information systems (PAIS). Process flexibility is important, but you have to consider both build time flexibility (how to quickly implement and configure new processes) and run time flexibility (how to deal with uncertainty and exceptional cases during execution), as well as their impact on process optimization.

We started by looking at the flexibility issues inherent in the imperative approach to BPM, where pre-defined process models are deployed and executed, and the execution logs monitored (in other words, the way that almost all BPMS work today). As John Hoogland discussed this morning, there are a number of flexibility issues at build time due to regional process variations or the lack of sufficient information about decisions to build them into the process model. There are also flexibility issues at run time, mostly around exception handling and the need for ad hoc changes to the process. As all this rolls back to the process analyst through the execution monitoring, it can be used to optimize the process model, which requires flexibility in evolving the process model and impacting work in progress. The key problem is that there are way too many variants in most real-life processes to realistically model all of them: there needs to be a way to model a standard process, then allow user-driven configuration (either explicitly or based on the instance parameters) at run time. The Provop approach presented in the tutorial allows for selective enabling and disabling of process steps in a master model based on the instance conditions, with much of the research based on the interaction between the parameters and the soundness of the resultant models.
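The selective enabling idea is easy to picture as code. This is a minimal sketch with invented step names and conditions, not Provop’s actual notation, and it omits the soundness checking that the research focuses on:

```python
# Hypothetical master model: each step carries a predicate over the
# instance context that decides whether it is enabled for this variant.
MASTER_PROCESS = [
    ("receive_order",   lambda ctx: True),
    ("credit_check",    lambda ctx: ctx["amount"] > 1000),
    ("regional_review", lambda ctx: ctx["region"] == "EU"),
    ("fulfil_order",    lambda ctx: True),
]

def configure_variant(master, ctx):
    """Derive a process variant by keeping only the steps whose
    condition holds for this instance's context."""
    return [name for name, enabled in master if enabled(ctx)]

# A small US order skips both optional steps:
variant = configure_variant(MASTER_PROCESS, {"amount": 500, "region": "US"})
# variant == ["receive_order", "fulfil_order"]
```

The hard part that the sketch glosses over is exactly what the researchers study: guaranteeing that every combination of enabled and disabled steps still yields a sound, executable model.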

Late binding and late modeling approaches use a pre-specified business process with one or more placeholder activities, then the placeholder activities are replaced with a process fragment at run time either from a pre-determined set of process fragments or a process fragment assembled by the user from existing activity templates (the latter is called the “pockets of flexibility” approach, a name that I find particularly descriptive).
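Late binding can likewise be sketched in a few lines; the placeholder convention and fragment names here are invented for illustration:

```python
# A registry of pre-modelled process fragments (names are illustrative).
FRAGMENTS = {
    "standard_shipping": ["pack", "label", "hand_to_courier"],
    "express_shipping":  ["pack", "priority_label", "courier_pickup"],
}

# A pre-specified process with one placeholder activity, marked <...>.
PROCESS = ["receive_order", "charge_payment", "<shipping>", "close_order"]

def bind_placeholders(process, bindings):
    """Replace each placeholder activity with the fragment chosen for it
    at run time (late binding)."""
    resolved = []
    for step in process:
        if step.startswith("<") and step.endswith(">"):
            resolved.extend(FRAGMENTS[bindings[step[1:-1]]])
        else:
            resolved.append(step)
    return resolved

plan = bind_placeholders(PROCESS, {"shipping": "express_shipping"})
```

In the “pockets of flexibility” variant, the fragment wouldn’t come from a fixed registry at all: the user would assemble it from activity templates when the placeholder is reached.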

Up to this point, the focus has been on changes to the process model to handle variability that is part of normal business, but not necessarily considered exceptional. Next, we looked at runtime exception handling, such as the failure or unavailability of a web service that causes the normal process to halt. Exceptions that are expected (anticipated) can be handled with compensation, with the compensation events and handler built into the process model; unexpected exceptions may be managed with ad hoc process changes to that executing instance. Ad hoc process changes can be a bit tricky: they need to be done at a high level of abstraction in order to make them understandable to the user making the change, yet the correctness of the changes must be validated before continuing. This ability needs to be constrained to a subset of the users, and the users who can make the changes may require some assistance to do this correctly.
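The compensation pattern for expected exceptions can be sketched saga-style; the booking scenario and handler names below are invented:

```python
def run_with_compensation(steps):
    """Execute steps in order; on failure, run the compensation handlers of
    all completed steps in reverse order (a sketch of the anticipated-
    exception handling described above)."""
    completed = []
    for name, action, compensate in steps:
        try:
            action()
            completed.append(compensate)
        except Exception:
            for comp in reversed(completed):
                comp()  # roll back already-completed work
            return "compensated"
    return "completed"

# Illustrative scenario: the flight-booking web service is unavailable,
# so the already-booked hotel must be compensated (cancelled).
log = []

def unavailable_service():
    raise RuntimeError("web service unavailable")

steps = [
    ("book_hotel", lambda: log.append("book_hotel"),
     lambda: log.append("cancel_hotel")),
    ("book_flight", unavailable_service,
     lambda: log.append("cancel_flight")),
]
status = run_with_compensation(steps)  # log is now ["book_hotel", "cancel_hotel"]
```

The key point is that this handling is modeled up front, as part of the process definition; it’s the unexpected exceptions that require the trickier ad hoc instance changes.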

This was a good tutorial, but I wanted to catch the process mining session so skipped out at the break and missed the last half.

Comparing BPM conferences

The fall conference season has kicked off, and I’ve already had the pleasure of attending 3 BPM conferences: the International BPM conference (academic), Appian’s first user conference (vendor), and the Gartner BPM summit (analyst). It’s rare to have 3 such different conferences crammed into 2 weeks, so I’ll sum up some of the differences that I saw.

The International BPM conference (my coverage) features the presentation of papers by academics and large corporate research labs covering various areas of BPM research. Most of the research represented at the conference is around process modeling in some way — patterns, modularity, tree structures, process mining — but there were a few focused on process simulation and execution issues as well. The topics presented here are the future of BPM, but not necessarily the near future: some of these ideas will likely trickle into mainstream BPM products over the next 5 years. It’s also a very technical conference, and you may want to arm yourself with a computer science or engineering background before you wade into the graph theory, calculus and statistics included in many of these papers. This conference is targeted at academics and researchers, but many of the smaller BPM vendors (the ones who don’t have a big BPM research lab like IBM or SAP) could benefit by sending someone from their architecture or engineering group along to pick up cool ideas for the future. They might also find a few BPM-focused graduate students who will be looking for jobs soon.

Appian’s user conference (my coverage) was an impressive small conference, especially for their first time out. Only a day long, plus another day for in-depth sessions at their own offices (which I did not attend), it included the obligatory big-name analyst keynote followed by a lot of solid content. The only Appian product information that we saw from the stage was a product update and some information on their new partnership with MEGA; the rest of the sessions were their customers talking about what they’ve done with Appian. They took advantage of the Gartner BPM summit being in their backyard, and scheduled their user conference for earlier the same week so that Appian customers already attending Gartner could easily add on a day to their trip and attend Appian’s conference as well. Well run, good content, and worth the trip for Appian customers and partners.

Gartner’s BPM summit (my coverage), on the other hand, felt bloated by comparison. Maybe I’ve just attended too many of these, especially since they started going to two conferences per year last year, but there’s not a lot of new information in what they’re presenting, and there seems to be a lot of filler: quasi-related topics that they throw in to beef up the agenda. There was a bit of new material on SaaS and BPM, but not much else that caught my interest. Two Gartner BPM summits per year is (at least) one too many; I know that they claim to be doing it in order to cover the east-west geography, but the real impact is that the vendors are having to pony up for two of these expensive events each year, which will kill some of the other BPM events due to lack of sponsorship. Although I still think that the Gartner BPM summit is a good place for newbies to get a grounding in BPM and related technologies, having a more diverse set of BPM events available would help the market overall.

If you’re a customer and have to choose one conference per year, I’d recommend the user conference put on by your BPM vendor — you’ll get enough of the general information similar to Gartner, plus specific information about the product that you’ve purchased and case studies by other customers. If you haven’t made a purchasing decision yet and/or are really new to BPM, then the Gartner BPM summit is probably a better choice, although there are other non-vendor BPM events out there as well. For those of you involved in the technical side of architecting and developing BPM products at vendors or highly sophisticated customers, I recommend attending the International BPM conference.

BPM Milan: From Personal Task Management to End User Driven Business Process Modeling

Todor Stoitsev of SAP Research presented the last of the flexibility and user interaction papers, From Personal Task Management to End User Driven Business Process Modeling. This is based on research about end-user development, but posits that BPMN is not appropriate for end users to work with directly for ad hoc process modeling.

There is quite a bit of research related to this: workflow patterns, ad hoc workflow, email-based workflow, instance-based task research, process mining, and other methods that provide better collaboration with the end users during process modeling. In this case, they’ve based their research on programming by example, where processes are inferred by capturing the activities of process participants. This involves not just the process participants (business users), but also a domain expert who uses the captured ad hoc activities to work towards a process model, which is eventually formalized in concert with a programmer, and turned into formal workflow models. In formalizing ad hoc processes, it’s critical to consider issues such as pattern reuse, and they have built tools for exploring task patterns as well as moving through to process definition, the latter of which is prototyped using jBPM.

As with most of the other papers today, I can’t do justice to the level of technical detail presented here; I’m sure that the proceedings are available in some form, or you can track down the authors for more information on their papers.

BPM Milan: Visual Support for Work Assignment

Massimiliano de Leoni presented a paper on Visual Support for Work Assignment in Process-Aware Information Systems, co-authored by Wil van der Aalst and Arthur ter Hofstede.

This is relevant for systems with worklists, where it may not be clear to the user which work item to select based on a variety of motivations. In most commercial BPMS, a worklist contains just a list of items, each with a short description of some sort; he is proposing a visual map of work items and resources, where any number of maps can be defined based on different metrics. In such a map, the user can select the work item based on its location on the map, which represents its suitability for processing by that user at that time.

He walked us through the theory behind this, then the structure of the visualization framework as implemented. He then showed an example of how this would appear on a geographic map, which was a surprise to me: I was thinking about more abstract mapping concepts, but he had a geographic example that used a Google map to visualize the location of the resources (people who can process the work item) and work items. He also showed a timeline map, where work items were positioned based on time remaining to a deadline.
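To make the map idea concrete, here’s a minimal sketch using two invented metrics (hours to deadline and geographic distance) and arbitrary weights; the framework itself lets you define any number of such metrics:

```python
import math
from datetime import datetime, timedelta

def map_position(item, user, now):
    """Position of a work item on a two-dimensional suitability map for one
    user: x = hours remaining until the deadline, y = distance from the
    user to the item. Metric choices and weights are assumptions."""
    hours_left = (item["deadline"] - now).total_seconds() / 3600.0
    distance = math.dist(user["location"], item["location"])
    return (hours_left, distance)

def rank_worklist(items, user, now, w_time=1.0, w_dist=0.5):
    """The equivalent sorted-list view: order items by a weighted
    combination of urgency and distance, most suitable first."""
    return sorted(items, key=lambda item: sum(
        w * v for w, v in zip((w_time, w_dist), map_position(item, user, now))))

now = datetime(2009, 9, 10, 9, 0)
items = [
    {"id": "far-and-late", "deadline": now + timedelta(hours=48), "location": (3, 4)},
    {"id": "near-and-urgent", "deadline": now + timedelta(hours=1), "location": (0, 0)},
]
user = {"location": (0, 0)}
ranked = rank_worklist(items, user, now)  # "near-and-urgent" comes first
```

As the `rank_worklist` function suggests, any single map can indeed be collapsed into a sorted list; the visual map earns its keep when a user wants to trade off several metrics at once rather than accept one fixed weighting.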

Maybe I’m just not a visual person, but I don’t see why the same information can’t be conveyed by sorting the worklist (where the user would then choose the first item in the list as being the highest recommendation), although future research in turning a time-lapse of the maps into a movie for process mining is a cool concept.

BPM Milan: Supporting Flexible Processes Through Log-Based Recommendations

For the first of the papers in the session on Flexibility and User Interaction, Helen Schonenberg of Eindhoven University of Technology presented a paper on Supporting Flexible Processes Through Log-Based Recommendations, co-authored by Barbara Weber, Boudewijn van Dongen and Wil van der Aalst.

This is related to the research on automated process mining (or process discovery) based on system logs, which is similar to the type of work being done by Fujitsu with their process discovery product/service.

She started by discussing recommender systems, such as we are all familiar with from sites like Amazon: the user provides some input, and based on their past behavior and that of users who are similar in some way, the system recommends items. Recommendation algorithms are based on filtering of user/item matrices and aggregation of the results.

In the case of a process recommender, there is a key goal such as minimizing throughput time; from this and a filtered view of the history log of this and other users’ past performance, a next step can be recommended. Their current work is focused on fine-tuning the filtering algorithms by which the possible paths in the log are filtered for use as recommendations, and the weighted aggregation algorithms.
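Here’s a minimal sketch of that kind of log-based recommendation, using an invented log and simple prefix matching as the filtering abstraction (prefix matching is just one of the options; the research explores several filtering and weighted-aggregation schemes):

```python
from collections import defaultdict

# Hypothetical execution log: (completed trace, total throughput time in hours).
LOG = [
    (["register", "check", "approve", "archive"], 10),
    (["register", "check", "reject", "archive"], 6),
    (["register", "approve", "archive"], 4),
    (["register", "check", "approve", "archive"], 12),
]

def recommend_next(partial_trace, log):
    """Recommend the next activity that minimises expected throughput time,
    using only log traces that share the running instance's prefix."""
    durations = defaultdict(list)
    n = len(partial_trace)
    for trace, duration in log:
        if trace[:n] == partial_trace and len(trace) > n:
            durations[trace[n]].append(duration)
    if not durations:
        return None  # no matching history to recommend from
    return min(durations,
               key=lambda act: sum(durations[act]) / len(durations[act]))

recommend_next(["register", "check"], LOG)  # "reject": 6h average vs 11h
```

Note that this only makes sense in exactly the setting the paper describes: the user must actually have a choice of which activity to execute next, with the recommendation presented through the worklist.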

She walked us through their experimental setup and results, and showed that it is possible to improve processes by the use of runtime recommendations, in the case where users have the choice of which activity to execute next. This can be used in any system that has a logging system and uses a worklist for task presentation.

BPM Milan: Automating Knowledge Transfer

Michael Granitzer of Know-Center Graz presented a paper on Automating Knowledge Transfer and Creation in Knowledge Intensive Business Processes, co-authored by Gisela Granitzer, Stefanie Lindstaedt and Andreas Rath, also of Know-Center, Klaus Tochtermann of Graz University of Technology, and Wolfgang Groiss of m2n consulting (I know that I’m committing a big faux pas by rearranging the order of the authors, but it seems more logical for me to group them by organization).

The key issue is that the wealth of information about processes and best practices amongst users of systems is often never captured and used to feed back into process documentation or process improvement. Although it’s possible to use wikis and other social software to attempt to collect this information, the authors have devised automated mechanisms for gathering this information through detecting and documenting user interactions and tasks in a knowledge base, which can then be mined and analyzed by a process designer in order to feed back into the global process and its documentation.

The system captures the end-user’s activities (content and context) automatically by detecting events, grouping them into blocks, then into tasks. The task recognition itself is important, since it uses automated predictive classification techniques for recognizing tasks based on the events (now I’m in 1983 in a pattern recognition course 😉 ), and they’re achieving around 75% accuracy in their recognition rates. Note that these are not events and tasks executed in the context of a structured business process in a BPMS, but rather the use of any application available to the user in order to do their work: the web, MS-Office tools, etc. The classification methods were trained, in part, by a period of the users manually tagging their events as specific tasks.
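As a rough illustration of that trained classification step, here’s a simple nearest-centroid classifier over bag-of-words features; the event texts and task labels are invented, and this is not the authors’ actual method, just the general shape of the idea:

```python
import math
from collections import Counter, defaultdict

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def train(tagged_events):
    """Build one bag-of-words centroid per task from events that users
    tagged manually during the training period (event texts such as
    window titles or document names are illustrative)."""
    centroids = defaultdict(Counter)
    for text, task in tagged_events:
        centroids[task].update(text.lower().split())
    return centroids

def classify(text, centroids):
    """Assign a new event to the task whose centroid is most similar."""
    words = Counter(text.lower().split())
    return max(centroids, key=lambda task: cosine(words, centroids[task]))

TAGGED = [
    ("monthly budget spreadsheet xls", "reporting"),
    ("quarterly budget figures xls", "reporting"),
    ("draft contract clause review doc", "legal"),
]
centroids = train(TAGGED)
```

The real system has to go further, of course: it groups raw events into blocks before classifying them into tasks, which is where most of the difficulty (and the ~75% accuracy ceiling) lies.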

On the mining and analysis side, they looked at process mining techniques such as the ProM framework, and explorative analysis techniques, but I have the sense that they haven’t been quite as successful in automating that side of things.

There are a number of concepts derived from this research, including that of tagging resources, that is, being able to capture knowledge of which users perform which tasks.

They plan to continue the research, which will include fine-tuning of task detection, and enhancing the classification methods to allow grouping of task groups into processes.