Process Mining with Python tutorial: A healthcare application — Part 2

Published in Wonderful World of Data Science

4 min read · Aug 4, 2020

This article is the second of a tutorial series made up of the following parts:

Part 1: Introduction to process mining, data preprocessing and initial data exploration.
Part 2 (this article): Primer on process discovery using the PM4Py (Python) library to apply the Alpha Miner algorithm.
Part 3: Other process discovery algorithms and model representations.
Part 4: More holistic models that integrate control flow, time (e.g. bottlenecks, wait times), resources (e.g. personnel capacity and performance, inter-personnel relationships, department/ward capacity and performance) and case attributes (e.g. patient demographics, clinical condition).

You can find the complete source code and data for this tutorial series here. In Part 1 of the tutorial, you saw how to prepare and explore your data to get a high-level ‘feel’ for the processes. In this part, you will learn how to use the pm4py library to discover process models by working with an example dataset (the same dataset used to illustrate exploration in Part 1).

Formatting the event log for PM4Py

The pm4py library accepts event logs either as CSV data loaded into a standard pandas DataFrame or in the XES format. XES is a standard format for storing event logs, but it is not adopted in all contexts. The two relevant objects are converter from pm4py.objects.conversion.log (here aliased log_converter), which…
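As a rough illustration of the dataframe route, the sketch below renames the columns of a small made-up table to the standard XES attribute keys (case:concept:name, concept:name, time:timestamp) and parses the timestamps. The column names patient, action and DateTime, the sample rows, and the helper names to_xes_style and to_event_log are assumptions made for this example; they are not taken from the tutorial's dataset or source code. The optional to_event_log helper assumes pm4py is installed and that log_converter.apply accepts a dataframe carrying these standard column names.

```python
import pandas as pd

# Hypothetical raw event table; the tutorial's real dataset may use other column names.
raw = pd.DataFrame({
    "patient": ["p2", "p1", "p1"],
    "action": ["Registration", "Triage", "Registration"],
    "DateTime": ["2020-01-01 10:05", "2020-01-01 09:20", "2020-01-01 09:00"],
})

# Map the assumed column names onto the standard XES attribute keys.
XES_COLUMNS = {
    "patient": "case:concept:name",   # case identifier
    "action": "concept:name",         # activity name
    "DateTime": "time:timestamp",     # event timestamp
}


def to_xes_style(df, mapping=XES_COLUMNS):
    """Return a copy with XES-style column names, parsed timestamps and events ordered per case."""
    out = df.rename(columns=mapping)
    out["time:timestamp"] = pd.to_datetime(out["time:timestamp"])
    return out.sort_values(["case:concept:name", "time:timestamp"]).reset_index(drop=True)


def to_event_log(df):
    """Optional step (requires pm4py): convert the prepared dataframe to an event log."""
    from pm4py.objects.conversion.log import converter as log_converter
    return log_converter.apply(df)


event_df = to_xes_style(raw)
print(event_df)
```

With pm4py available, calling to_event_log(event_df) would hand the prepared dataframe to the converter. Sorting by case and timestamp first keeps each patient's events in chronological order, which is what discovery algorithms rely on.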

Written by c3d3