Divide-and-Conquer Strategies for Process Mining #BPM2009

In the first of two papers in the final session of the conference, Josep Carmona of Universitat Politecnica de Catalunya presented on process mining calculation strategies. The theory of regions shows how to derive a Petri net representation of a process model from the process log, which shows the transition between states, but it’s very computationally expensive. This paper deals with ways of making that computation less expensive in order to deal effectively with large logs.

First is a decompositional strategy, which partitions the regions in a way that allows the identification of a set of state machines that cover all the events, then uses parallel composition to assemble the state machines into a Petri net.

The second approach is a higher-level divide-and-conquer strategy, where the event log is recursively partitioned by event class until the log sections are small enough to use other techniques. The clustering of the events is the key thing here: first, compute the causal dependency graph, then use spectral graph theory to find clusters of highly related events that will be partitioned off into their own section of the event log.

What they’ve seen in experiments using this technique is that there is a significant computational improvement (from minutes to seconds) from the decompositional approach, and that the divide-and-conquer approach allows for the processing of event logs that are just too large for other techniques.

You can get Genet, the tool that they developed to do this, here.

Leave a Reply Cancel reply