Simulating a Health-Tech Marketplace Part II — SimPy

Published in Data-Science-Lite, Aug 28, 2023

Quick Recap

If you haven’t read my first post, go back and check it out (link). TL;DR: I work for a health-tech startup, and we want to simulate a day in our marketplace to answer these questions:

How much supply do we need?
What if our demand increased by x?
Who is most/least critical to the network?
What should we pay for a visit? How much can a clinician expect to earn?
What is the ROI of a specific incentive?
What if we routed differently? What algorithm is best?
What if clinician A had a different license set or client set?
What is the expected wait time for a patient at a given moment?

Wheel routes patients to a network of clinicians, similar to how you would be connected to a driver in Uber. However, we have many clients who use our platform to provide care to their patients so we are a B2B company. Each client has different requirements and our clinicians serve many clients, so how do we plan our network to most effectively and efficiently complete our clients’ consultations?

We are going to spend today unpacking some key requirements for the simulation. In a later article, we will kick off our SimPy Python journey.

Key Requirement 1: My Clinicians are unique, as are my patients…

In most simulation models we tend to generalize attributes: one waiter is constructed no differently from any other waiter. You will see examples with banks, restaurants, car washes and hospitals, all of which assume that every one of their resources is the same.

In the Basics of SimPy you will see that each resource type has its own definition and quantity. So every Nurse is the same and every Nurse Practitioner is the same. In the simulation you just specify the quantity of each resource, which works for a brick-and-mortar business but not for our marketplace, where each clinician has unique attributes.

However, the clinicians (supply) we work with all have different attributes:

State Licenses are unique
Payer Enrollments are unique
Clients are unique
Schedules are unique

Each of our consults (demand) is unique too! Their attributes are:

State
Client
Consult Duration
SLA
Modality: asynchronous (chat) or synchronous (phone)

A ridiculous analogy:

Imagine a restaurant that makes every cuisine in the world. Customers can sit anywhere and choose any cuisine. Chefs can only serve the cuisines they are trained in, so if a chef isn’t trained in the art of smoking brisket, they can’t make it. Each cuisine has its own wait times, processes, and dishes, which mimics how consults work.

I want my chefs to be able to make many cuisines, and the way I will track which cuisines they can serve is by creating a class that represents each chef’s unique attributes. So if a chef is trained in multiple cuisines, they will be able to serve, for example, [Peruvian cuisine, Tanzanian cuisine, American cuisine, Korean cuisine, Sri Lankan cuisine]. Remember, there are thousands of cuisines, and since a chef can have any combination of them, we need to represent each chef uniquely.

Requirement #1:

I want to be able to upload a list of clinicians that have unique attributes (listed above). They can only serve consults that match these attributes.
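A minimal sketch of what this requirement could look like in plain Python, under some assumptions: the field names, the pipe-separated row format, and the sample clinicians are all hypothetical, and payer enrollments are left out to keep it short. Each clinician is its own object, and `can_serve` checks a consult against that clinician's licenses, clients, and schedule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consult:
    consult_id: str
    state: str
    client: str
    arrival_min: int  # minutes since midnight (hypothetical unit)


@dataclass(frozen=True)
class Clinician:
    name: str
    licensed_states: frozenset
    clients: frozenset
    shift_start_min: int
    shift_end_min: int

    def can_serve(self, consult):
        """True only if license, client, and schedule all match."""
        return (
            consult.state in self.licensed_states
            and consult.client in self.clients
            and self.shift_start_min <= consult.arrival_min < self.shift_end_min
        )


def load_clinicians(rows):
    """Build one Clinician per uploaded row (e.g. rows read from a CSV)."""
    return [
        Clinician(
            name=row["name"],
            licensed_states=frozenset(row["states"].split("|")),
            clients=frozenset(row["clients"].split("|")),
            shift_start_min=int(row["shift_start_min"]),
            shift_end_min=int(row["shift_end_min"]),
        )
        for row in rows
    ]


# Made-up upload: two clinicians with different licenses, clients, and shifts.
rows = [
    {"name": "clinician_a", "states": "NY|NJ", "clients": "client_x",
     "shift_start_min": "420", "shift_end_min": "900"},
    {"name": "clinician_b", "states": "CA", "clients": "client_x|client_y",
     "shift_start_min": "480", "shift_end_min": "1020"},
]
clinicians = load_clinicians(rows)

consult = Consult("c1", state="NY", client="client_x", arrival_min=430)
eligible = [c.name for c in clinicians if c.can_serve(consult)]
print(eligible)  # ['clinician_a']
```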

Key Requirement 2: Custom Queue Prioritization

Pretend we have 100 consults that arrived at 7AM. Which clinician takes each consult? Why? What order? How do you determine which consult is at the top of the queue?

SimPy has some built-in queueing logic: FIFO (first in, first out). But what if you have some secret-sauce queue prioritization logic you want to test? You could use SimPy’s priority resource behavior; however, the priority order of our consults is ever changing. I’d rather calculate a score of some kind and rank the consults. Our consults also follow many different kinds of queueing priority logic, depending on clinician and consult characteristics. Each clinician might take the consults in a different order, so the logic could be clinician specific or consult specific.

Going back to the restaurant example:

Chef McElduff might prioritize Italian cuisine first, then Peruvian, then American cuisine.
Chef Buck might prioritize American, then Peruvian, then Korean cuisine.

Requirement #2:

I want to use my custom queuing logic. It’s dynamic and ever changing.
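To make this concrete with the restaurant analogy, here is a small sketch of a ranking that is recomputed every time a chef becomes free. The rule itself is invented for illustration (overdue orders first, then the chef's cuisine preference, then longest wait), as are the order data and SLA values. It is not our actual prioritization logic.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    cuisine: str
    arrived: float  # minutes since opening
    sla: float      # minutes the customer is willing to wait


def priority_key(order, prefs, now):
    """Lower tuples are served first; recomputed at every decision point."""
    overdue = (now - order.arrived) > order.sla
    return (0 if overdue else 1, prefs.index(order.cuisine), order.arrived)


def next_order(queue, prefs, now):
    """Pick the best order this chef is able (and prefers) to cook."""
    eligible = [o for o in queue if o.cuisine in prefs]
    if not eligible:
        return None
    return min(eligible, key=lambda o: priority_key(o, prefs, now))


queue = [
    Order("o1", "Peruvian", arrived=0, sla=10),
    Order("o2", "American", arrived=1, sla=30),
    Order("o3", "Italian", arrived=2, sla=30),
    Order("o4", "Korean", arrived=3, sla=30),
]
mcelduff = ["Italian", "Peruvian", "American"]
buck = ["American", "Peruvian", "Korean"]

# At minute 5 nothing is overdue, so each chef follows their own preference.
print(next_order(queue, mcelduff, now=5).order_id)  # o3
print(next_order(queue, buck, now=5).order_id)      # o2
# At minute 12 the Peruvian order has blown its SLA and jumps the queue.
print(next_order(queue, mcelduff, now=12).order_id)  # o1
```

The same queue produces a different "top of the queue" per chef and per moment, which is the part a fixed priority number cannot express.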

Key Requirement 3: Time Distributions

Let’s revisit the restaurant example: each dish takes a different amount of time to prepare. If you look at 1,000 chefs, you can learn the typical time (the average) it takes to make a dish, or better yet the full distribution of durations (whether for cooking food or completing a consult).

ChatGPT: A probability distribution is a mathematical function that describes the likelihood of obtaining the possible values that a random variable can take. It provides a way to model uncertainty and variation in data, such as samples collected in statistical studies.

Spoiler: We chose to represent our times using a cumulative probability distribution, but the selection can vary based on the volume of data you have. Alternatives such as normal, logarithmic, or exponential distributions might be a better fit depending on the specific data characteristics.

Italian Cuisine

Carbonara Pizza

Time distribution: 12–15 Minutes
SLA (time a customer is willing to wait): 20 minutes

Chicken Parmesan

Time distribution: 15–20 Minutes
SLA (time before a customer leaves restaurant): 25 minutes

American Cuisine

Hamburger & Fries

Time distribution: 4–7 Minutes
SLA (time before a customer leaves restaurant): 15 minutes

Chicken Strips & Waffle Fries

Time distribution: 10–15 Minutes
SLA (time before a customer leaves restaurant): 20 minutes

We want to create a unique probability distribution for each dish (consult attribute). It might help us answer some questions about our overall strategy. What if you had an item that was very labor intensive to prepare but its profit margin was low? A per-dish distribution gives you the ability to model what would happen if that item wasn’t on the menu. Your overall SLA performance for the remaining dishes might improve.

The same goes for our health marketplace: there are network effects at play. What if we added expert clinicians focused on just one client, with high efficiency? Or what if we added clinicians who are multi-licensed generalists, less efficient but covering more ground? What is the best strategy for our supply?

Requirement #3:

Use custom probability time distributions for each Client + type of care they deliver (or, in the restaurant example, Cuisine + Dish).
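One way to sketch this, assuming you have raw observed durations for each Cuisine + Dish pair: keep the sorted observations as an empirical distribution and sample from it directly instead of fitting a named distribution. The class name and the observation values below are made up (chosen to fall inside the ranges listed above), so treat this as an illustration of the idea rather than our implementation.

```python
import bisect
import random


class EmpiricalDuration:
    """Empirical distribution built straight from observed durations."""

    def __init__(self, observations):
        self.sorted_obs = sorted(observations)

    def cdf(self, x):
        """Fraction of observations that are <= x."""
        return bisect.bisect_right(self.sorted_obs, x) / len(self.sorted_obs)

    def sample(self, rng):
        """Draw u in [0, 1) and return the observation at that quantile."""
        u = rng.random()
        return self.sorted_obs[int(u * len(self.sorted_obs))]


# Hypothetical observed prep times in minutes, keyed by (cuisine, dish).
durations = {
    ("American", "Hamburger & Fries"): EmpiricalDuration([4, 5, 5, 6, 7]),
    ("Italian", "Carbonara Pizza"): EmpiricalDuration([12, 13, 13, 14, 15]),
}

rng = random.Random(42)  # seeded so a simulated day is repeatable
burger = durations[("American", "Hamburger & Fries")]
print(burger.cdf(5))  # 0.6, i.e. 3 of the 5 observations are <= 5 minutes
samples = [burger.sample(rng) for _ in range(5)]
print(samples)  # every draw is one of the observed values
```

With hundreds of thousands of data points this keeps the real shape of the data, including odd bumps and long tails. With only a handful of observations, a fitted distribution is probably the safer choice.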

Key Requirement 4: Consults arrive at what time?

We went back and forth on whether to model this with an arrival time distribution of some kind. We thought we could do something like "x consults arrive every y minutes", using a Poisson distribution. It’s super common in a world of limited data, but let’s look at our options below:

Option 1: 🤮

Each client/treatment area could have its own Poisson arrival process, but then I need a separate generator function (more on generators later) for each client/treatment area.

Option 2: 🤔

We use one Poisson arrival process and create a probability distribution that gives each consult a probability of belonging to a specific client/treatment area.

Option 3: 😅

We use historical data as the source of truth. So arrival times are exactly the same as the way they happened in real life for each client/treatment area.

Option 2 isn’t bad. It would take some fine-tuning, but it would allow me to scale the volume up and down with ease. However, to keep my model rooted in reality to start, I want to use the actual arrival times of our consults.
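For the curious, Option 2 could be sketched roughly like this: a single Poisson process (exponential gaps between arrivals) plus a weighted draw that labels each consult. The rate, the client/treatment names, and the mix weights are all invented for the example.

```python
import random


def simulate_arrivals(rate_per_min, mix, horizon_min, rng):
    """One Poisson arrival stream; each arrival gets a weighted random label."""
    labels = list(mix)
    weights = [mix[label] for label in labels]
    t = 0.0
    arrivals = []
    while True:
        # Gaps between arrivals in a Poisson process are exponential.
        t += rng.expovariate(rate_per_min)
        if t > horizon_min:
            break
        label = rng.choices(labels, weights=weights, k=1)[0]
        arrivals.append((t, label))
    return arrivals


# Hypothetical mix: the share of consults per (client, treatment area).
mix = {
    ("client_x", "urgent_care"): 0.6,
    ("client_x", "dermatology"): 0.1,
    ("client_y", "urgent_care"): 0.3,
}

rng = random.Random(7)
arrivals = simulate_arrivals(rate_per_min=0.5, mix=mix, horizon_min=60, rng=rng)
for minute, (client, treatment) in arrivals[:3]:
    print(f"{minute:6.2f} min  {client}  {treatment}")
```

Scaling demand up or down is then a one-line change to `rate_per_min`, which is the appeal of this option.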

Most places don’t have the luxury of observing every data point; they are limited to some 100 samples, so they fit a distribution to represent their population. I am fortunate to have hundreds of thousands of data points for this exercise. 😀

Requirement #4:

Go with Option 3: use historical arrival times as the input for consult creation times.
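A rough sketch of what replaying history could look like, assuming each historical record carries a creation timestamp, a client, and a treatment area (the field names and the three sample records are hypothetical). The timestamps are converted to minutes since the start of the simulated day, sorted, and turned into the gaps a single generator would wait between consults.

```python
from datetime import datetime


def replay_schedule(records, day_start):
    """Turn historical timestamps into sorted (minute, client, treatment) events."""
    events = []
    for record in records:
        created = datetime.fromisoformat(record["created_at"])
        minute = (created - day_start).total_seconds() / 60
        events.append((minute, record["client"], record["treatment"]))
    events.sort()
    return events


def inter_arrival_gaps(events):
    """Yield (wait, client, treatment): how long to wait before each consult."""
    previous = 0.0
    for minute, client, treatment in events:
        yield minute - previous, client, treatment
        previous = minute


# Made-up historical records, deliberately out of order.
records = [
    {"created_at": "2023-08-01T07:05:00", "client": "client_a", "treatment": "urgent_care"},
    {"created_at": "2023-08-01T07:00:00", "client": "client_b", "treatment": "dermatology"},
    {"created_at": "2023-08-01T07:05:30", "client": "client_a", "treatment": "urgent_care"},
]
day_start = datetime(2023, 8, 1, 7, 0, 0)

events = replay_schedule(records, day_start)
gaps = list(inter_arrival_gaps(events))
print(gaps[0])  # (0.0, 'client_b', 'dermatology')
print(gaps[1])  # (5.0, 'client_a', 'urgent_care')
print(gaps[2])  # (0.5, 'client_a', 'urgent_care')
```

Because every client and treatment area flows through the same sorted list, one generator is enough, which avoids the per-client generators of Option 1.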

All Requirements

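I could go on for days about each requirement, but I wanted to pull out the specific requirements where my approach to simulation might differ from the normal approach. It’s important to me that each resource’s unique attributes are reflected as much as possible in the simulation. The four requirements covered in this post are listed below. Reach out if you want to understand any of them in more depth!

Requirement #1: Upload a list of clinicians with unique attributes; they can only serve consults that match those attributes.
Requirement #2: Use custom queuing logic that is dynamic and ever changing.
Requirement #3: Use custom probability time distributions for each Client + type of care.
Requirement #4: Use historical arrival times as the input for consult creation times.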

In the next article, we will focus on navigating through the syntax of SimPy and begin structuring our code.