Multi… Process Mining at ICPM 2020 shown by Industry

By Dirk Fahland.

Here is a list of activities by the process mining vendors at ICPM that I have seen in the program and that – in my view – relate to or show use cases for Multi… Process Mining.

  • ABBYY, Thursday, October 8 – 12:30h: “Logistics – Application of process intelligence beyond one system of record”
  • Celonis, Thursday, October 8 – 13:00 “Language for Processes” (Celonis’ process query language inevitably will touch on looking beyond event data in a trace). 15:45h: “New Teaching Case Studies for Process Mining in Audit and Accounting”. Possibly also Celonis’ Process Query
  • EY, Thursday October 8 12:45 “Process Mining in auditing”, 16:15 “Working Capital Management” (repeated Friday October 9 twice)
  • MyInvenio, Presenting Multi-Level Process Mining throughout their product demos on 7th, 8th, and 9th October
  • UiPath, Thursday Oct 8 13:00-13:30 – Data Transformation & Connectors (data extraction and transformation from complex source systems for process analysis)
  • LanaLabs, Thursday Oct 8, 1.30pm – 1.50pm: Use Case P2P & LANA Connect (data extraction and transformation from complex source systems for process analysis) + Oct 9 on-demand Tool Demos.

It seems that Thursday 8th October 12:30-14:00 is high noon for Multi… Process Mining among the process mining vendors at ICPM.

I may have missed some relevant events – if I did, let me know in the comments.

Artifact-Centric Process Mining for ERP-Systems with Multiple Case Identifiers

By Dirk Fahland.

Analyzing processes supported by an ERP system, such as Order-to-Cash or Purchase-to-Pay, is one of the most frequent use cases of process mining. At the same time, it is one of the most challenging, because the processes operate on multiple related data objects such as orders, invoices, and deliveries in n:m relations.

Event log extraction for process mining always flattens the relational structures into sequences of events. The top part of the following poster illustrates what goes wrong during this kind of event log extraction: events are duplicated and false behavioral dependencies are introduced.

Poster summarizing process mining on ERP systems
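The effect of flattening can be reproduced on a tiny example. The following is a minimal sketch, assuming a hypothetical order `o1` that is split over two deliveries `d1` and `d2`; the event tuples, the `related` table, and the helper names `flatten` and `directly_follows` are illustrative assumptions, not part of any real extraction tool.

```python
# Hypothetical toy data: (time, activity, object the event belongs to).
events = [
    (1, "Create Order", "o1"),
    (2, "Create Delivery", "d1"),
    (3, "Create Delivery", "d2"),
    (4, "Ship Delivery", "d1"),
    (5, "Ship Delivery", "d2"),
]
# Assumed relation table: which objects are related to which.
related = {"d1": ["o1"], "d2": ["o1"], "o1": ["d1", "d2"]}


def flatten(events, related, case_ids):
    """Build one trace per case id from the events of the case object
    and of all objects related to it (classical flattening)."""
    log = {}
    for case in case_ids:
        scope = {case, *related.get(case, [])}
        log[case] = [act for _, act, obj in sorted(events) if obj in scope]
    return log


def directly_follows(log):
    """All pairs of activities that directly follow each other in some trace."""
    return {(a, b) for trace in log.values() for a, b in zip(trace, trace[1:])}


# Delivery as case id: the single "Create Order" event is copied into both traces.
by_delivery = flatten(events, related, ["d1", "d2"])
# Order as case id: events of different deliveries are interleaved in one trace.
by_order = flatten(events, related, ["o1"])

print(by_delivery)
print(directly_follows(by_order))
```

With the delivery as case identifier, "Create Order" is counted twice although it happened once. With the order as case identifier, the log suggests that "Create Delivery" can follow "Create Delivery", a dependency that no single delivery ever exhibits.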

A possible way to prevent this flattening is to extract one event log per object or entity in the process: one log for all orders, one log for all invoices, one log for all deliveries. The result is a so-called artifact-centric process model that shows one “life-cycle model” describing the process activities per data object.

But analyzing the process over all objects also requires extracting event data about how objects “interact”. Technically, this can be done by extracting one event log per relation between two related data objects (or tables). From these, we can learn the flow and behavioral dependencies across different data objects.

Decomposing the event data in this way into multiple event logs ensures that event sequences either follow one concrete data object or follow a concrete relation between two related data objects. The resulting model only contains “valid flows”.
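The decomposition can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the extraction procedure of any particular tool: the toy events, the `relations` table, and the function names `object_logs`, `relation_logs`, and `cross_object_flows` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (time, activity, object id, object type).
events = [
    (1, "Create Order", "o1", "order"),
    (2, "Create Delivery", "d1", "delivery"),
    (3, "Create Delivery", "d2", "delivery"),
    (4, "Ship Delivery", "d1", "delivery"),
    (5, "Ship Delivery", "d2", "delivery"),
    (6, "Close Order", "o1", "order"),
]
# Assumed relation table: order o1 is split over deliveries d1 and d2.
relations = [("o1", "d1"), ("o1", "d2")]


def object_logs(events):
    """One event log per object type, with one trace per object (its life-cycle)."""
    logs = defaultdict(dict)
    for _time, activity, obj, obj_type in sorted(events):
        logs[obj_type].setdefault(obj, []).append(activity)
    return dict(logs)


def relation_logs(events, relations):
    """One trace per pair of related objects, holding the events of both objects."""
    return {
        pair: [(act, obj) for _time, act, obj, _type in sorted(events) if obj in pair]
        for pair in relations
    }


def cross_object_flows(rel_logs):
    """Directly-follows pairs in which control passes from one object to the other."""
    flows = set()
    for trace in rel_logs.values():
        for (a1, o1), (a2, o2) in zip(trace, trace[1:]):
            if o1 != o2:
                flows.add((a1, a2))
    return flows


life_cycles = object_logs(events)
flows = cross_object_flows(relation_logs(events, relations))
print(life_cycles)
print(flows)
```

In this sketch the life-cycle of each delivery is simply "Create Delivery" followed by "Ship Delivery", and the only flows between objects are those along an actual order-delivery relation. No trace mixes events of two unrelated deliveries.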


Performance Spectrum for Analyzing Business Processes

By Dirk Fahland.

In this post, I show how simple data visualizations help to understand the multi-dimensional nature of processes. Even the simplest classical business processes have important dynamics that cannot be understood by looking at cases in isolation.

The Performance Spectrum was originally designed to deliver a useful process map for analyzing logistics processes over time. When applying the same technique to business process event data, the performance spectrum proves equally useful as it unveils process characteristics that were so far hidden by existing process mining tools:

  • performance of business processes actually varies greatly over time,
  • different cases influence each other (and are influenced by external mechanisms such as shared resources or policies) much more than previously shown,
  • processes not only have multiple variants in control-flow but also multiple, overlapping variants in performance,
  • different processes have very different performance spectra, but processes from a similar domain show similar performance spectra.

To support and trigger further research in this area, we released a smaller-scale version of the Performance Spectrum Miner enabling the analysis of business process event logs. The platform independent (requiring JRE8) tool is available as a

Examples of Performance Spectra

Below, we show some performance spectra of publicly available event logs.

Road Traffic Fine Management process

In the figure below, we see the typical process map or model of the Road Traffic Fines Management Process event log on the left. It describes the possible behavior of handling a single case (a traffic ticket). The arc width and annotations tell how long it takes a case to go from one activity to the next.

On the right we see the Performance Spectrum of this process. Each horizontal stripe describes the transition between two activities – called a segment. Each colored diagonal or vertical line is a case passing through this segment over time. The longer and more diagonal the line, the longer the case took.
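The data behind such a picture is simple to derive. The following is a minimal sketch, assuming a hypothetical toy log with time stamps given in days; the function name `performance_spectrum` and the data layout are illustrative assumptions and do not reflect the internals of the Performance Spectrum Miner.

```python
from collections import defaultdict

# Hypothetical toy log: case id -> list of (activity, time stamp in days).
log = {
    "c1": [("Create Fine", 0), ("Send Fine", 10), ("Payment", 15)],
    "c2": [("Create Fine", 2), ("Send Fine", 10), ("Payment", 40)],
    "c3": [("Create Fine", 3), ("Payment", 4)],
}


def performance_spectrum(log):
    """Map each segment (a, b) to the lines (start, end, case) of all cases
    in which activity b directly follows activity a."""
    spectrum = defaultdict(list)
    for case, trace in log.items():
        ordered = sorted(trace, key=lambda event: event[1])
        for (a, t_a), (b, t_b) in zip(ordered, ordered[1:]):
            spectrum[(a, b)].append((t_a, t_b, case))
    for lines in spectrum.values():
        lines.sort()  # order the lines of a segment by start time
    return dict(spectrum)


spectrum = performance_spectrum(log)
for segment, lines in spectrum.items():
    print(segment, lines)
```

Each tuple `(start, end, case)` corresponds to one line in the stripe of its segment: drawn from `start` at the top of the stripe to `end` at the bottom, a long duration yields a long, flat diagonal and a short duration an almost vertical line.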

Performance Spectrum of the main variants of the road traffic fine management process

We can immediately spot very different patterns in each of the segments, clearly showing that the cases are not handled in isolation, but that something manages their progress. We can see

  • Payments from traffic offenders happening at various rates, and changing density
  • Multiple incoming cases being batched together during the “Send Fine” step and afterwards being processed individually (non-batched again)
  • While some “Send Fine” steps are being executed immediately
  • Penalty is being added after a fixed delay leading to a FIFO behavior in the process
  • However, cases arrive at the “Add Penalty” step in larger groups at irregular intervals leading to emergent batches of penalty notifications – with different success in the speed of payment
  • Credit collection always happening in larger batches for cases not paid 6 months prior

Looking at a single event log, we can see that even a classical process over a single entity (a traffic ticket) is subject to dynamics beyond the scope of a single case.
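Two of the patterns named above, FIFO behavior and batching, can also be checked numerically once the lines of a segment are available as (start, end) pairs. This is a rough sketch under my own simplifying assumptions: a batch is taken to be a group of cases leaving the segment at exactly the same time stamp, and the names `fifo_violations` and `batches` are hypothetical.

```python
from collections import Counter


def fifo_violations(lines):
    """Count pairs of cases where a case that entered the segment later
    left it earlier, i.e. one case overtook the other."""
    ordered = sorted(lines)
    count = 0
    for i, (start_1, end_1) in enumerate(ordered):
        for start_2, end_2 in ordered[i + 1:]:
            if start_2 > start_1 and end_2 < end_1:
                count += 1
    return count


def batches(lines, min_size=2):
    """Group cases that leave the segment at the same time stamp."""
    ends = Counter(end for _start, end in lines)
    return {time: size for time, size in ends.items() if size >= min_size}


# Hypothetical segment with batching: three cases wait until day 10,
# while one case is handled immediately and overtakes them.
batched = [(0, 10), (2, 10), (3, 10), (4, 5)]
# Hypothetical segment with a fixed delay of 60 days: strict FIFO, no batches.
fixed_delay = [(0, 60), (1, 61), (5, 65)]

print(fifo_violations(batched), batches(batched))
print(fifo_violations(fixed_delay), batches(fixed_delay))
```

On real data, time stamps rarely coincide exactly, so a practical variant would group end times within a small tolerance window instead of testing for equality.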

The Credit Application process of BPIC12

We can see

  • A weekly working pattern where most process steps are concluded on the same day from Monday to Friday and mostly in FIFO order, though some steps also see some work on Saturdays
  • However, several steps also show violations of the FIFO behavior and longer waiting times spanning more than one day
  • Cases and work coming in on the weekend and overnight are processed early on the very next working day

The same Credit Application process 5 years later in BPIC17

  • The weekly working patterns seen in BPIC12 remain, but more steps show violations of the FIFO behavior.
  • In addition, we can observe batching, for example in the O_Accepted step taking place every month, although some cases are not included in the batch and processed immediately.
  • Cancellation after an offer was sent follows mostly a FIFO pattern, but with variable delays
  • The aggregated performance spectrum shows a lot of variability in the workload in the process, with some very unusual peaks in offers being created and later on cancelled over a longer period of time.
  • Towards the end of the dataset, the performance spectrum shows a significant improvement in the performance of the process across several steps (most cases there are now handled in the bottom quartile rather than in the top quartile)
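The quartile reading in the last observation can be made concrete with a small sketch. It assumes that each line of a segment is classified by the quartile of its duration among all durations observed in that segment, with class 0 for the fastest quarter and class 3 for the slowest; the function name `quartile_classes` and the toy data are hypothetical.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_classes(lines):
    """Classify each (start, end) line of a segment by the quartile of its
    duration: 0 = fastest quarter of all cases, 3 = slowest quarter."""
    durations = [end - start for start, end in lines]
    cuts = quantiles(durations, n=4)  # the three quartile boundaries
    return [bisect_left(cuts, duration) for duration in durations]


# Hypothetical segment: cases starting before day 10 are slow,
# cases starting from day 10 onwards are fast.
lines = [(0, 8), (1, 8), (2, 8), (3, 8), (10, 14), (11, 14), (12, 14), (13, 14)]
classes = quartile_classes(lines)

early = [c for (start, _end), c in zip(lines, classes) if start < 10]
late = [c for (start, _end), c in zip(lines, classes) if start >= 10]
print(early, late)
```

In this toy segment all early cases fall into the two slowest classes and all late cases into the two fastest ones, which is the kind of shift that shows up as a change of color over time in the spectrum.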

The building permit processes of BPIC15

  • The performance spectra of the two logs from two different municipalities are very similar to each other, but are very different from the previous logs.
  • The overall throughput per day is much lower.
  • Most cases are processed across the steps in a FIFO manner, though not following a strict working week pattern, and cases are handled on the same day across multiple steps.
  • Processing seems to happen in stages where certain steps are performed for all cases together for a certain period of time, while there is no activity in other steps during the same period.

Your turn!

What can you find in the Performance Spectra of the other public event logs, or your own data? Get the Performance Spectrum Miner from http://www.promtools.org/ or from https://github.com/processmining-in-logistics/psm and try it out!

The Performance Spectrum

Dirk Fahland

One of the core challenges of process analytics from event data is to enable an analyst to get a comprehensive understanding of the process and where problems reside.

In business process mining such an overview is obtained with a process map. It can be discovered from event data to visualize the flow in the process and highlight deviations and bottlenecks.

Process maps of logistics processes do not give these insights: they are too large to comprehend, the maps do not visualize how processing of materials influences each other, and – as they show an aggregate of all event data – they fail to visualize how performance and processing varies and changes over time.

In the “Process mining in Logistics” project by Eindhoven University of Technology and Vanderlande, we therefore developed a new visual analytics technique which we call the Performance Spectrum:

  • The performance spectrum maps out process dynamics for all steps and all cases over time, by adding a “time axis” to the process map.
  • The performance spectrum visualizes each case and each step over time individually, allowing analysts to see how materials and cases of a process are handled together and how they influence each other.
  • The explicit visualization of all cases together reveals how process deviations and short- and long-term performance problems evolve over time and influence each other.

The image below shows the performance spectrum of a baggage handling system along a sequence of check-in lines over time. Bags are put into the system at point a1 and then are moved via conveyor belts to point a2. Each blue or orange line in the top-most segment a1:a2 in the performance spectrum shows the movement of one bag from point a1 to point a2 over time. The angle (and color) of the line indicates its speed.

Performance Spectrum of Check-in Aisles in a Baggage Handling System

As shown on the layout schema below, further bags enter the system at another check-in point a2 and are moved on towards point a3, so that both flows merge on the segment a2:a3, etc. All bags eventually reach the point “s” from where the bags are routed further into the baggage handling system. In the performance spectrum, we can see the movement of a bag over these segments through the consecutive lines.

Layout of Conveyor Belts in Check-in Area of a Baggage Handling System

As bags cannot overtake each other on a conveyor belt, we can immediately identify in the performance spectrum several behavioural patterns:

  • Normal operations, for example in the left part of the performance spectrum, show how bags flow together from the check-in points to point s, each segment having its own speed, and no bags are overtaking each other.
  • Repeated operational problems can be seen in the segment a2:a3 (orange-slanted lines) where the conveyor belts are halted for a certain period, leading to significantly delayed processing, to no flow in segments a3:a4 and a4:a5, and to backwards queuing in segment a1:a2, while segment a5:s is unaffected as the bags coming from a5 can move freely.
  • After the short-term performance problems are resolved, the system shows recovery behaviour under high load as it returns to normal operations, visible by a large number of bags (many lines close together) moving two times slower than normal (light blue).
  • Moreover, the repeated performance problem was already briefly visible in a2:a3 in the initial phase (showing a group of bags moving 3x slower than normal).

The visualization allows process managers and engineers to quickly locate the cause of the problem and to prevent it from happening in the future. In particular, the briefly visible performance problem in a2:a3 prior to the halt of the conveyor belt can be identified as an early warning signal to detect possible performance problems in the future, and also to understand and improve system recovery behavior.

We realized this technique in a high-performance visualization tool which we call the Performance Spectrum Miner. It has proven itself reliable to:

  • analyze very large amounts of event data (of over 100 million events),
  • quickly identify temporary process deviations in very large processes,
  • quickly locate short- and long-term performance problems as well as gradual and abrupt changes in process performance,
  • identify the root-cause of performance problems and deviations in logistics processes occurring only under certain conditions.

We released a smaller-scale version of the tool (as a ProM plugin or as a standalone tool) together with a manual on https://github.com/processmining-in-logistics/psm.


Multi-Dimensional Process Thinking

Dirk Fahland

I am trying to sketch the landscape of describing, analyzing, and managing processes outside the well-established paradigm of a “BPMN process” where a process is executed in instances, and each instance is completely isolated from all other instances.

Thinking about Processes

Let me introduce the term “process thinking”.

Process-thinking is the fundamental paradigm for understanding, designing, and implementing goal-oriented behaviors in social and technical systems and organizations of all kinds and sizes.

Process thinking structures the information flow between various actors and resources in terms of processes: several coherent steps designed to achieve common and individual goals together.

Throughout a process, multiple actors, resources, physical objects and information entities interact and synchronize with each other.

The scope of process thinking varies depending on the system and dynamics considered based on “how many dynamics to consider?” (outer scope) and “how many entities describe these dynamics?” (inner scope).

“BPMN Process Thinking” and Classical Process Mining
One Execution – Single Entity

BPMN and classical process mining focus primarily on describing and analyzing information handling dynamics as they are found in many administrative procedures, for instance in insurance companies or universities.

Processes are scoped in terms of individual cases (or documents) whose information is processed along a single process description independent of other cases, often in a workflow system. In terms of scoping, such processes encompass a single-dimensional inner scope (information processing) structured into a single-dimensional outer scope (along a single case).

One Execution – Multiple Entities

Most organizations operate multiple processes sharing data or materials, which requires considering multiple processes and objects and their interlinked dynamics together.

Process thinking around dynamics in manufacturing and retail organizations, such as Order-to-Cash or Purchase-to-Pay processes, is often supported by complex Enterprise Resource Planning (ERP) or Customer Relationship Management (CRM) systems.

Processes here are centered around updating and managing a collection of shared and interlinked documents by various actors together leading to mutually dependent and interconnected dynamics of multiple objects and processes (multi-dimensional outer scope) with a focus on information processing (single dimensional inner scope).

Taking the system into the picture

While information processing is the dominant behavior analyzed in process mining, the dynamics of a process may also be characterized and analyzed in other dimensions.

For example, we can ask how actors and resources, physical materials, and the underlying systems participate in the processing of cases of the same process (inner scope of process thinking):

  • How are actors and resources involved in the dynamics – and how does the involvement of actors and resources influence the dynamics, for instance through availability, workload, and capabilities?
  • How are physical materials involved in the dynamics, for instance through transporting or storing large amounts of materials via conveyor belts or vehicles? How do their physical properties and constraints influence the dynamics?
  • How are the underlying systems involved in the dynamics, and how do their capabilities and limitations influence the dynamics, for instance through queueing, prioritizing, and assigning of work or the (reliability of) automation of steps?


In most processes, these different factors of processing are not independent but influence each other: the progress of a case depends on the availability of information, actors, and corresponding materials alike, and is subject to the limited availability of processing resources and the physical limitations of the supporting systems. Characterizing a single dynamic therefore requires multiple dimensions (inner scope).

Multiple Executions – One Entity

Processes for manufacturing and logistics, such as baggage handling at airports, combine information handling with material flows.

Physical items are processed along a logical process flow – and at the same time have to be moved around a physical environment of conveyor belts, carts, machines, and workers. Steadiness of flow is the central process objective.

In this characteristic, the processing of one material item depends not only on the logical process it has to go through but also on all other items that surround it: they together define whether work accumulates at a particular machine, work cannot be completed at the desired quality, or target deadlines are met. Did your bag reach the flight?

Call centers and hospitals are other examples where the processing of one case highly depends on what happens with other cases. A long waiting time in a queue can make a customer service contact go very differently. The quality and next steps in a medical treatment depend on how well the medical staff can focus on your case.

These phenomena cannot be observed, analyzed, and improved when studying each case in isolation.

Multiple Executions – Multiple Entities

More advanced logistics operations, such as warehouse automation and manufacturing systems, also consider material flows that are being merged together, through batch processing and manufacturing steps.

Analyzing and improving processes in such systems requires both a multi-dimensional inner scope and a multi-dimensional outer scope.

And now? Let’s talk…

What are your thoughts on this? Feel free to join and post a response here!