By Dirk Fahland.
Analyzing processes supported by an ERP system, such as Order-to-Cash or Purchase-to-Pay, is one of the most frequent use cases of process mining. At the same time, it is one of the most challenging, because these processes operate on multiple related data objects such as orders, invoices, and deliveries in n:m relations.
Event log extraction for process mining always flattens these relational structures into sequences of events. The top part of the following poster illustrates what goes wrong during this kind of event log extraction: events are duplicated and false behavioral dependencies are introduced.
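The duplication effect can be reproduced on a toy example. The sketch below is only an illustration under assumed data: the event records, the field names (`orders`, `deliveries`), and the `flatten` helper are hypothetical and not taken from any ERP system or tool. One delivery serves two orders, and the log is flattened with the order as the case identifier.

```python
from collections import Counter

# Hypothetical events; each event lists the objects it refers to.
events = [
    {"id": "e1", "activity": "Create Order", "time": 1, "orders": ["o1"], "deliveries": []},
    {"id": "e2", "activity": "Create Order", "time": 2, "orders": ["o2"], "deliveries": []},
    {"id": "e3", "activity": "Create Delivery", "time": 3, "orders": ["o1", "o2"], "deliveries": ["d1"]},
    {"id": "e4", "activity": "Ship Delivery", "time": 4, "orders": ["o1", "o2"], "deliveries": ["d1"]},
]

def flatten(events, case_notion):
    """Build one trace per object of the chosen case notion (e.g. 'orders')."""
    log = {}
    for event in sorted(events, key=lambda e: e["time"]):
        for obj in event[case_notion]:
            log.setdefault(obj, []).append(event["id"])
    return log

flat = flatten(events, "orders")

# Events that end up in more than one trace have been duplicated.
counts = Counter(eid for trace in flat.values() for eid in trace)
duplicated = sorted(eid for eid, n in counts.items() if n > 1)
print(flat)
print(duplicated)
```

In this toy log the four original events become six trace entries, because both delivery events are copied into the traces of both orders.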
A possible way to prevent this flattening is to extract one event log per object or entity in the process: one log for all orders, one log for all invoices, one log for all deliveries. The result is a so-called artifact-centric process model that shows one “life-cycle model” describing the process activities per data object.
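As a minimal sketch of this idea, assume each event belongs to exactly one object. The tuples, the object types, and the `logs_per_object_type` helper below are hypothetical; the cited papers and XTract may organize the extraction differently.

```python
from collections import defaultdict

# Hypothetical events: (time, activity, object type, object id).
events = [
    (1, "Create Order", "order", "o1"),
    (2, "Create Delivery", "delivery", "d1"),
    (3, "Create Delivery", "delivery", "d2"),
    (4, "Ship Delivery", "delivery", "d1"),
    (5, "Create Invoice", "invoice", "i1"),
    (6, "Ship Delivery", "delivery", "d2"),
    (7, "Close Order", "order", "o1"),
]

def logs_per_object_type(events):
    """Return one log per object type; each log maps an object id to its trace."""
    logs = defaultdict(lambda: defaultdict(list))
    for _time, activity, obj_type, obj_id in sorted(events):
        logs[obj_type][obj_id].append(activity)
    return {obj_type: dict(traces) for obj_type, traces in logs.items()}

logs = logs_per_object_type(events)
for obj_type, traces in logs.items():
    print(obj_type, traces)
```

Each event appears in exactly one trace, so nothing is duplicated, and each trace describes the life-cycle of a single object.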
But analyzing the process over all objects also requires extracting event data about how objects “interact”. Technically, this can be done by extracting one event log per relation between two related data objects (or tables). From these logs, we can learn the flow and behavioral dependencies across different data objects.
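One simple reading of "one event log per relation" is to create one trace per pair of related objects, containing the events of both. The sketch below assumes this reading; the data, the `order_delivery` relation, and the `relation_log` helper are hypothetical and not the method of any specific tool.

```python
# Hypothetical events: (time, activity, object id).
events = [
    (1, "Create Order", "o1"),
    (2, "Create Delivery", "d1"),
    (3, "Create Delivery", "d2"),
    (4, "Ship Delivery", "d1"),
    (5, "Ship Delivery", "d2"),
]

# Hypothetical relation between orders and deliveries (e.g. from a foreign key).
order_delivery = [("o1", "d1"), ("o1", "d2")]

def relation_log(events, relation):
    """Return one trace per related object pair, ordered by time."""
    log = {}
    for pair in relation:
        log[pair] = [activity for _time, activity, obj in sorted(events) if obj in pair]
    return log

log = relation_log(events, order_delivery)
for pair, trace in log.items():
    print(pair, trace)
```

Each trace follows one order together with one of its deliveries, so the events of `d1` and `d2` are never interleaved and no dependency between the two unrelated deliveries is suggested.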
Decomposing the event data into multiple event logs in this way ensures that each event sequence follows either one concrete data object or one concrete relation between two related data objects. The resulting model contains only “valid flows”.
- Erik H. J. Nooijen, Boudewijn F. van Dongen, Dirk Fahland: Automatic Discovery of Data-Centric and Artifact-Centric Processes. Business Process Management Workshops 2012: 316-327
- Xixi Lu, Marijn Nagelkerke, Dennis van de Wiel, Dirk Fahland: Discovering Interacting Artifacts from ERP Systems. IEEE Trans. Serv. Comput. 8(6): 861-873 (2015)
- XTract. Software for artifact-centric log extraction: https://svn.win.tue.nl/trac/prom/browser/XTract