How GenStage Kept Medical Records Current
Everyone has probably heard the saying that your backup is only as good as your restore. When it comes to medical records, you’re only as good as your data is current. The question is: how do you stay up to date with hundreds of thousands of records?
One of our clients faced this exact question.
One of their core services is providing patient outreach capabilities via SMS to their partners. These partners include some of the largest hospitals in Africa. Messages might remind patients of an annual physical or give tips on how to lower their cholesterol. To be effective, these messages must be targeted.
This requires current data.
We’ve all accidentally sent the wrong person a text message. When dealing with healthcare, it’s critical that we don’t make that mistake. Sending a message about a health condition that no longer applies, or to a patient who has passed away, is unacceptable.
So let’s imagine the following scenario: we have 150k patients in our system that fit the criteria for an SMS. In actuality, only 142k still match the hospital’s records; the data for the rest is out of date. Since we can’t tell which records are stale, we need to confirm every one of the 150k records before sending the message.
Well, we’re dealing with Elixir, right? Let’s just hammer the hospital’s API. We could spawn a number of async tasks and wouldn’t miss a beat. Elixir’s ability to create a large number of processes makes this trivial. The problem, as you can imagine, is that we could overwhelm their servers. Some services may even restrict the maximum number of API calls we can make.
We need a way to limit the number of calls we make at any given time to a specific partner hospital.
We need GenStage.
What makes GenStage different from a typical worker queue? Consider the experience of going to a nightclub. The nightclub has a max capacity and so once that capacity is reached, a line forms outside the door. A doorman lets folks know when there is available capacity and can let people through. Once in the door, they can pay a cover and possibly leave something at coat check.
Thinking about this example, you might realize that it is the available capacity that drives the process. It also dictates the number of people who can be let in. Someone inside the club indicates there is capacity for two. This message is sent to the person collecting the cover charge, who tells the doorman to let two people in. This is the pipeline.
Typical worker queues involve a storage mechanism and a batch of workers that come by and take jobs off one at a time. You might think of the storage mechanism as a producer of jobs and the workers as consumers of these jobs.
GenStage adds a wrinkle to this paradigm. The consumers tell a producer how much demand they can handle. The producer then returns jobs up to the maximum requested. In our earlier example, this is the club employee telling the cover charge employee to let people in with a max demand of two.
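To make the demand-driven flow concrete, here is a minimal sketch of the nightclub as a producer and a consumer. The module names (`Club.Line`, `Club.Floor`) and the guest list are made up for illustration, and the script assumes Elixir 1.12+ so that `Mix.install/1` can fetch the `gen_stage` package.

```elixir
# Assumes Elixir 1.12+ (for Mix.install/1) and the gen_stage package from Hex.
Mix.install([{:gen_stage, "~> 1.0"}])

defmodule Club.Line do
  # Producer: the line outside the door. Nobody moves until demand arrives.
  use GenStage

  def start_link(people), do: GenStage.start_link(__MODULE__, people, name: __MODULE__)

  @impl true
  def init(people), do: {:producer, people}

  # `demand` is the number of events the consumer asked for.
  # We hand over at most that many and keep the rest waiting.
  @impl true
  def handle_demand(demand, waiting) when demand > 0 do
    {admitted, still_waiting} = Enum.split(waiting, demand)
    {:noreply, admitted, still_waiting}
  end
end

defmodule Club.Floor do
  # Consumer: the club itself. It never asks for more than two people at once.
  use GenStage

  def start_link(report_to), do: GenStage.start_link(__MODULE__, report_to)

  @impl true
  def init(report_to) do
    {:consumer, report_to, subscribe_to: [{Club.Line, max_demand: 2, min_demand: 0}]}
  end

  @impl true
  def handle_events(people, _from, report_to) do
    # Report each batch back to the caller so we can see who was admitted.
    send(report_to, {:admitted, people})
    {:noreply, [], report_to}
  end
end

{:ok, _line} = Club.Line.start_link(["Ada", "Grace", "Joe", "Mike", "Robert"])
{:ok, _floor} = Club.Floor.start_link(self())

# Wait for the first batch; it holds at most two people.
first_batch =
  receive do
    {:admitted, people} -> people
  after
    1_000 -> []
  end

IO.inspect(first_batch, label: "first batch")
```

The producer never pushes on its own. It only answers `handle_demand/2`, so the consumer’s `max_demand` is what caps how many people are in flight. Setting `min_demand: 0` tells the consumer to ask again only once the current batch has been fully handled.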
Given that this is Elixir, it should be no surprise that the concept of pipelines makes an appearance in GenStage. Although we’ve only talked about producers and consumers, there is one other type: a producer-consumer.
A producer-consumer receives demand from a consumer and passes it up to the next layer in the pipeline. This continues until we reach the producer who returns jobs. Each stage in the pipeline can then perform a step in the process and pass along the result to the next stage.
In our example, the producer is the line of people out the door. The producer-consumer could be both the cover charge and the coat check. Finally, the consumer is the club itself.
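The three roles can be sketched as a small pipeline. This is only an illustration: the `Nightclub.*` module names and the `:coat_checked` tag are invented here, and a single coat-check stage stands in for the middle of the pipeline. As before, it assumes Elixir 1.12+ for `Mix.install/1`.

```elixir
# Assumes Elixir 1.12+ (for Mix.install/1) and the gen_stage package from Hex.
Mix.install([{:gen_stage, "~> 1.0"}])

defmodule Nightclub.Line do
  # Producer: the line of people out the door.
  use GenStage

  def start_link(people), do: GenStage.start_link(__MODULE__, people, name: __MODULE__)

  @impl true
  def init(people), do: {:producer, people}

  @impl true
  def handle_demand(demand, waiting) when demand > 0 do
    {admitted, still_waiting} = Enum.split(waiting, demand)
    {:noreply, admitted, still_waiting}
  end
end

defmodule Nightclub.CoatCheck do
  # Producer-consumer: consumes from the line, does its step, emits downstream.
  use GenStage

  def start_link(_opts), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    {:producer_consumer, :ok, subscribe_to: [{Nightclub.Line, max_demand: 2, min_demand: 0}]}
  end

  @impl true
  def handle_events(people, _from, state) do
    # Whatever is returned as events here is passed to the next stage.
    checked = Enum.map(people, fn person -> {person, :coat_checked} end)
    {:noreply, checked, state}
  end
end

defmodule Nightclub.Floor do
  # Consumer: the club itself, the end of the pipeline.
  use GenStage

  def start_link(report_to), do: GenStage.start_link(__MODULE__, report_to)

  @impl true
  def init(report_to) do
    {:consumer, report_to, subscribe_to: [{Nightclub.CoatCheck, max_demand: 2, min_demand: 0}]}
  end

  @impl true
  def handle_events(guests, _from, report_to) do
    send(report_to, {:inside, guests})
    {:noreply, [], report_to}
  end
end

# Start the stages from the top of the pipeline down.
{:ok, _} = Nightclub.Line.start_link(["Ada", "Grace", "Joe"])
{:ok, _} = Nightclub.CoatCheck.start_link([])
{:ok, _} = Nightclub.Floor.start_link(self())

first_guests =
  receive do
    {:inside, guests} -> guests
  after
    1_000 -> []
  end

IO.inspect(first_guests, label: "first guests inside")
```

Note that a producer-consumer implements `handle_events/3` like a consumer, but returns events in its reply like a producer. It has no `handle_demand/2` of its own.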
Let’s take a look at how this applies to medical records. Let’s first think about the steps to the problem:
- Retrieve medical record from system of origin
- Massage medical record into a format matching the application’s schema
- Update the local medical record
- If the medical record still fits the message criteria, send the SMS or email
When first learning GenStage, you might think to separate each of these steps into its own stage. After all, we see them as different steps in the process, so why shouldn’t they be their own stage?
Consider each stage as a unit of concurrency / fault tolerance / back-pressure rather than of code organization. — José Valim
If we look at these steps through this lens, we might group things differently.
For example, retrieving the medical record requires its own sense of back-pressure. Transforming the record could be handled within the same stage, which keeps the code that has to know about the transformation/mapping in one place.
Updating the medical record is a separate action that is best kept in its own stage. This stage can make the necessary updates before handing the record off to the final consumer.
Validating whether or not a given patient still meets the message criteria is the last step that belongs in its own stage. In doing so, we isolate both this check and the sending of the message.
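One way the grouping above could be laid out is sketched below. This is not the client’s implementation: every module name is hypothetical, the `Stub` functions stand in for the hospital API, the local database, and the eligibility check, and the producer of candidate patient ids is an assumption added so the pipeline has a source. It assumes Elixir 1.12+ for `Mix.install/1`.

```elixir
# Sketch only: all module names and Stub functions are hypothetical.
Mix.install([{:gen_stage, "~> 1.0"}])

defmodule Stub do
  # Stand-ins for the hospital API, the local database, and the criteria check.
  def fetch_remote(id), do: %{"patient_id" => id, "active" => rem(id, 2) == 0}
  def to_local(%{"patient_id" => id, "active" => active}), do: %{id: id, active: active}
  def save(record), do: record
  def eligible?(record), do: record.active
end

defmodule Records.Candidates do
  # Producer (assumed): ids of patients who matched the criteria locally.
  use GenStage
  def start_link(ids), do: GenStage.start_link(__MODULE__, ids, name: __MODULE__)

  @impl true
  def init(ids), do: {:producer, ids}

  @impl true
  def handle_demand(demand, ids) when demand > 0 do
    {batch, rest} = Enum.split(ids, demand)
    {:noreply, batch, rest}
  end
end

defmodule Records.FetchAndTransform do
  # Retrieval and mapping share a stage; max_demand caps how many ids it pulls at once.
  use GenStage
  def start_link(_), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    {:producer_consumer, :ok, subscribe_to: [{Records.Candidates, max_demand: 5, min_demand: 0}]}
  end

  @impl true
  def handle_events(ids, _from, state) do
    records = Enum.map(ids, fn id -> id |> Stub.fetch_remote() |> Stub.to_local() end)
    {:noreply, records, state}
  end
end

defmodule Records.Update do
  # Updating the local record is its own stage.
  use GenStage
  def start_link(_), do: GenStage.start_link(__MODULE__, :ok, name: __MODULE__)

  @impl true
  def init(:ok) do
    {:producer_consumer, :ok, subscribe_to: [{Records.FetchAndTransform, max_demand: 5, min_demand: 0}]}
  end

  @impl true
  def handle_events(records, _from, state) do
    {:noreply, Enum.map(records, &Stub.save/1), state}
  end
end

defmodule Records.ValidateAndSend do
  # Final consumer: re-check the criteria, then "send" the message.
  use GenStage
  def start_link(report_to), do: GenStage.start_link(__MODULE__, report_to)

  @impl true
  def init(report_to) do
    {:consumer, report_to, subscribe_to: [{Records.Update, max_demand: 5, min_demand: 0}]}
  end

  @impl true
  def handle_events(records, _from, report_to) do
    for record <- records, Stub.eligible?(record), do: send(report_to, {:sms_sent, record.id})
    {:noreply, [], report_to}
  end
end

{:ok, _} = Records.Candidates.start_link(Enum.to_list(1..10))
{:ok, _} = Records.FetchAndTransform.start_link([])
{:ok, _} = Records.Update.start_link([])
{:ok, _} = Records.ValidateAndSend.start_link(self())

first_sent =
  receive do
    {:sms_sent, id} -> id
  after
    1_000 -> :none
  end

IO.inspect(first_sent, label: "first SMS sent to patient")
```

Each `subscribe_to` entry is one of the levers: lowering `max_demand` on `Records.FetchAndTransform` slows how quickly ids are pulled toward the hospital without touching the other stages. In a real application these stages would more likely be started under a supervisor than from a script.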
If you’re coming from object-oriented programming, it’s only natural to want to divide each step into its own stage. The key thing to remember is that each stage provides the levers to control concurrency, fault tolerance, and back-pressure.
Want to learn more? Come to Empex NYC where I’ll be giving a talk on how we implemented this solution.