Process Mining with Python tutorial: A healthcare application — Part 2
This article is the second of a tutorial series made up of the following parts:
- Part 1: Introduction to process mining, data preprocessing and initial data exploration.
- Part 2 (this article): Primer on process discovery using the PM4Py (Python) library to apply the Alpha Miner algorithm.
- Part 3: Other process discovery algorithms and model representations.
- Part 4: More holistic models that integrate control flow, time (e.g. bottlenecks, wait times), resources (e.g. personnel capacity and performance, inter-personnel relationships, department/ward capacity and performance), and case attributes (e.g. patient demographics, clinical condition).
You can find the complete source code and data for this tutorial series here. In Part 1 of the tutorial, you saw how to prepare and explore your data to get a high-level ‘feel’ for the processes. In this part, you will learn how to use the pm4py library to discover process models by working with an example dataset (the same dataset used to illustrate exploration in Part 1).
Formatting the event log for PM4Py
The pm4py library works with event logs held either as CSV data (loaded into a standard pandas dataframe) or in the XES format. XES (eXtensible Event Stream) is a standard format for storing event logs, but it is not adopted in all contexts. The two relevant objects are converter from pm4py.objects.conversion.log (here aliased log_converter), which…