The Curious Case of Open-Source Ventilators

Apr 13, 2020

Recently, the entire planet has been taken over by the COVID-19 pandemic. As I started writing this article, the total death count due to the virus stood at 113,296.
The pandemic has created a shortage of ventilators, which are crucial for treating COVID-19 patients who are unable to breathe on their own. Last year, the total number of ventilators needed to meet worldwide (or, I’d say, planet-wide) demand was 77,000. Now, an additional 33,000 ventilators are needed in New York alone. Biomedical manufacturers have boosted their production by 30%-50%, but they cannot meet a 1000% increase in demand on their own.

There’s another side of the fence, where groups are working on low-cost open source ventilators to meet this demand. Various online videos, articles and instructables show steps that any enthusiast can follow to build a DIY ventilator from inexpensive raw materials (figure-1).

Figure-1: A bag-valve-resuscitator type ventilator design

Most of these open source ventilators are based on Arduino, which is a good and easy-to-use micro-controller platform. Many groups have already developed functional prototypes that are now under test. But there are some serious caveats and gotchas in ventilators built around this platform.

This article walks readers through the alarm bells, red flags, processes, practices and criticality involved in designing this crucial, life-saving medical equipment.

Ventilator as a Black Box

Here is a quick review of a mechanical ventilator as a black box, as shown in figure-2.

We have air, oxygen and, of course, electric power as inputs, and a controlled flow rate, pressure and breathing cycle as outputs. In essence, a mechanical ventilator is a cyber-physical system in which valves, pressure, flow, oxygen, air and respiratory cycles are the physical parameters that an embedded system must control, and control very precisely.

The entire operation (or mission) is run by embedded software, which must make sure that all of these quantities are measured and controlled in a timely and accurate manner. This makes the ventilator a safety-critical embedded system. Designing such a system needs careful thought, standard design patterns and procedures, and a lot of testing to make sure the system never fails. Because if it does, the result could be the loss of life.

Current open source ventilator designs have put a great deal of thought into innovative 3D printed parts and creative ways of squeezing an AMBU bag with actuators, and they use inexpensive sensors to measure pressure and flow rates. Some groups have even used medical grade sensors to get a step ahead. But simply wiring a medical grade sensor to the micro-controller and reading out the ADC values, without a standardized approach to embedded software development, can cause trouble.

It all boils down to Software

Embedded systems respond and react to the environment via sensors and actuators. This constrains an embedded system to perform at a pace consistent with its environment. Today, a major portion of an embedded system’s performance depends on its software. Embedded software patterns have evolved over the years but typically fall into three categories.

Procedural, Cyclic Execution: The software starts from a point, performs all tasks sequentially, and then repeats them in a loop. For example: system power on, task#1, task#2 … task#n, then back to task#1, and so on.
State Machines: Probably the best known and most widely used design pattern in embedded software development. Each process, procedure or task is a state. When some condition is met, the machine transitions into another state, which holds another task or processing element. This pattern gives more control over the flow of the main program (figure-3).
Others: A few more are queued state machines, producer-consumer and publish-subscribe. There are also well known design patterns for hardware drivers (sensors, actuators), for compute-intensive processes and for safety-critical processes.
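As a minimal sketch of the state-machine pattern described above, the snippet below models a breath cycle as three timed states. It is written in Python for readability; the state names, the durations and the BreathCycle class are assumptions made up for illustration and do not come from any existing ventilator design. Real firmware would express the same transition table in C or C++ on the target.

```python
from enum import Enum, auto


class State(Enum):
    INHALE = auto()
    HOLD = auto()
    EXHALE = auto()


# Hypothetical durations in milliseconds; real values come from clinical settings.
DURATION_MS = {State.INHALE: 1000, State.HOLD: 200, State.EXHALE: 2000}

# Transition table: each state names the state that follows it.
NEXT_STATE = {
    State.INHALE: State.HOLD,
    State.HOLD: State.EXHALE,
    State.EXHALE: State.INHALE,
}


class BreathCycle:
    """Time-driven state machine for one repeating breath cycle."""

    def __init__(self):
        self.state = State.INHALE
        self.elapsed_ms = 0

    def tick(self, dt_ms):
        """Advance time by dt_ms and transition when the state has timed out."""
        self.elapsed_ms += dt_ms
        while self.elapsed_ms >= DURATION_MS[self.state]:
            self.elapsed_ms -= DURATION_MS[self.state]
            self.state = NEXT_STATE[self.state]
        return self.state
```

Keeping the transitions in a table rather than in nested if-statements makes every legal path through the breath cycle visible in one place, which is what makes this pattern easy to review and test.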

All Ventilators are Real-Time Systems

As I have mentioned above, a ventilator must ensure the timely delivery of breaths while controlling the pressure and flow rate of the air and oxygen mix. This makes it a real-time system, because it must meet strict requirements on response time. Failure to respond in time may cause loss of life.

A COVID-19 patient needs a ventilator because they cannot breathe on their own, so they need mechanical assistance that helps them breathe. A ventilator does that by providing mandatory breaths, a mode known as Continuous Mandatory Ventilation (CMV). This gives the patient a minimum number of breaths. The breath timing must be controlled accurately, because any delay in providing these breaths, or breaths that come too frequently, can create a life-and-death situation.

This makes a ventilator a real-time system. The current makeshift designs lack timed software procedure calls. They might seem to be doing their job in time, but we cannot be sure how fast or slow these calls are actually made, and there is no mechanism for testing it. The deviation of a loop or program from its specified deadline is called jitter. Such systems require a Real-Time Operating System, whose scheduler makes sure that a given loop takes a specified amount of time to execute. Such systems are also known as hard real-time systems.

System designers must also divide their embedded application into time-critical and non-time-critical sections. For example, controlling the position of the pressure valve and the expiratory valves must be done in a hard real-time manner and hence executed under a strict deadline. This is the deterministic section. Updating the user interface screen, on the other hand, is not crucial compared to valve control and should therefore run in a non-time-critical loop. It is acceptable for the system to be a few seconds late in updating a graph on the screen, but not to delay breath cycles.
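To make the idea of jitter concrete, here is a small sketch that computes it from a log of loop start times. The function names and the logged timestamps are hypothetical; on a real board the timestamps would come from a hardware timer or a logic analyzer rather than from a Python list.

```python
def jitter_ms(timestamps_ms, period_ms):
    """Return how far each loop start deviates from its ideal release time.

    The ideal release time of iteration i is t0 + i * period_ms, where t0 is
    the first recorded timestamp.
    """
    t0 = timestamps_ms[0]
    return [t - (t0 + i * period_ms) for i, t in enumerate(timestamps_ms)]


def worst_case_jitter_ms(timestamps_ms, period_ms):
    """Largest absolute deviation, the figure a hard deadline is checked against."""
    return max(abs(j) for j in jitter_ms(timestamps_ms, period_ms))


# Hypothetical log of loop start times for a 10 ms control loop.
log = [0, 10, 21, 30, 43, 50]
print(jitter_ms(log, 10))             # [0, 0, 1, 0, 3, 0]
print(worst_case_jitter_ms(log, 10))  # 3
```

For a hard real-time loop it is the worst case, not the average, that matters: a single late iteration is already a missed deadline.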

Figure-4: Real-Time System with deadline and jitter

All Ventilators are Safety-Critical Systems

This encompasses everything. A ventilator is meant to save lives by giving breaths to the patient. Any minor failure can have only one outcome: a loss of life.

We know that ventilators are real-time systems, but not all real-time systems are safety-critical systems. There is an extra layer of process and design procedure involved. This time the stakes are higher, because in a safety-critical system, failure to meet a deadline can cause death.

Here’s an example. Suppose a sensor reading in a ventilator occasionally deviates from its normal values for some unknown reason (say X). Event X occurs again and causes the sensor value to deviate again, and over time this starts to happen more frequently. Every deviation is picked up by the downstream process (say, valve control), which then produces an invalid result too. Small defects can cascade into life-threatening events very quickly, as shown in Figure-5. These faults or defects can be as small as an uninitialized variable or a delayed memory read/write, or as big as a cosmic magician’s spell.

Figure-5: Small faults can lead to accidents

As shown in figure-5, a complete failure in some small part of a software sub-system (say level-1) is seen as a fault one level higher (level-2). The failure in level-2 is seen as a fault in level-3, and this cascade goes on until it reaches the highest, top-level loop. If this avalanche is not stopped, it can cause an overall system failure which, in the case of a ventilator, is unaffordable. So a ventilator is not only a safety-critical system but also a high-availability system. Congratulations, the stakes have just gone up.

The fundamental question to ask open source ventilator designers who have based their design on the Arduino platform is: “What if a patient on their ventilator dies? Who is to blame?” The answer is not straightforward, but this could get seriously dangerous for the designers. A safety-critical system must meet some standard.

For example, suppose a ventilator needs to operate 24 hours a day, seven days a week, and the designer specifies an up-time of 99%. Then exactly which 1% of the day (roughly 14 minutes) is it meant to fail? Have any tests been carried out?

What if a sensor fails? Is there a redundant sensor in place? If there is, how much time will the software take to switch to the other sensor before starting the next breath cycle? Is the switch bump-less? What if the second sensor fails as well? Is there a safe shutdown? Safety-critical systems usually take themselves into a safe shutdown to break the fault-error-failure-hazard chain shown above.
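The questions above can be turned into a small decision routine. The sketch below is only an illustration under stated assumptions: the select_pressure function, the 0 to 60 cmH2O plausibility window and the "safe_shutdown" action are all hypothetical, and what a safe state actually means (for example, raising an alarm and handing over to manual bagging) is a design decision this snippet does not settle.

```python
def select_pressure(primary, backup, lo=0.0, hi=60.0):
    """Pick a pressure reading, falling back to the backup sensor.

    primary and backup are readings in cmH2O, or None when a sensor gives no
    data. Returns (value, action) where action is "primary", "backup" or
    "safe_shutdown".
    """
    def plausible(x):
        # A reading is trusted only if it exists and lies in the assumed range.
        return x is not None and lo <= x <= hi

    if plausible(primary):
        return primary, "primary"
    if plausible(backup):
        return backup, "backup"
    # Both sensors failed: stop trusting the control loop and enter the safe state.
    return None, "safe_shutdown"
```

Because the check runs on every reading, the switch to the backup sensor happens within the same control cycle in which the primary is found implausible.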

Adding Redundancy in Safety Critical Ventilator Machines

There are various design patterns and methods for incorporating redundancy in sensors, actuators and even software processes. They are based either on hardware redundancy (adding more redundant sensors/actuators) or on software redundancy (adding redundant software sections).

This means adding more hardware components and a software block to control the switch-over. Some well known and tested redundancy patterns are:

Homogeneous Duplex Pattern: Add a second, identical sensor and use a simple comparison to switch between the two. This model assumes that two identical hardware components will never suffer a random fault simultaneously. It is also called standby redundancy (figure-6).
N-Modular Redundancy: Also known as parallel redundancy, this refers to having multiple units running in parallel. All units are tightly synchronized and receive the same input at the same time. Their output values are then compared, and a voter decides which output should be used. This model easily provides bump-less switch-overs. It comes in dual and triple modular redundancy variants (figure-7).
N-Version Programming Pattern: This pattern runs N programs in parallel, each performing the same task on the same input, to produce N outputs. A voter accepts the N results as inputs and determines the correct output according to a specific voting scheme.
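The voter is the heart of the last two patterns. As a hedged sketch, here is one common choice for triple modular redundancy, a median voter. The tmr_vote name and the tolerance parameter are assumptions for illustration, and other voting schemes (majority, weighted average) are equally valid.

```python
def tmr_vote(a, b, c, tolerance):
    """Triple modular redundancy voter using the median of three readings.

    Returns (value, healthy). value is the median, which is never the outlier
    when exactly one channel is faulty. healthy is False when no two channels
    agree within tolerance, so the caller can enter a safe state.
    """
    low, mid, high = sorted((a, b, c))
    agree = (mid - low) <= tolerance or (high - mid) <= tolerance
    return mid, agree
```

With readings of 20.1, 20.0 and 95.0 the voter returns 20.1 and reports a healthy vote, masking the faulty third channel without any explicit switch-over. This is why the pattern is bump-less.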

Redundancy Improves Availability

Availability is the percentage of time that a system is up and running for a particular mission time.

Availability = Uptime/(Uptime + Downtime)

If you have a mission time of 24/7 for six months and no downtime, then you have 100% availability. If you have one day of downtime during that same mission, the availability of the system drops to about 99.4%.
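The arithmetic can be checked with a few lines of code. This is a minimal sketch: the function names are hypothetical, and six months is assumed to be 180 days, one of which is lost to downtime.

```python
def availability(uptime, downtime):
    """Availability = Uptime / (Uptime + Downtime), as a fraction of 1."""
    return uptime / (uptime + downtime)


def allowed_downtime(avail, mission_time):
    """Downtime budget implied by an availability target over mission_time."""
    return (1.0 - avail) * mission_time


# Assumption: six months taken as 180 days, one of which is lost to downtime.
print(round(100 * availability(179, 1), 2))       # 99.44
# A 99% target over one day (1440 minutes) still permits this many minutes down.
print(round(allowed_downtime(0.99, 24 * 60), 1))  # 14.4
```

Read the other way round, the second function shows why a seemingly high availability figure can still be unacceptable for a machine that breathes for a patient.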

How Open Source Designers Can Benefit

We know that Arduino is easy to use. But it is not a real-time or safety-critical platform. Arduino was originally created for learning purposes. Deploying an Arduino-based ventilator design in a medical ward is not a good idea. Arduino isn’t suitable for safety-critical systems because it lacks hard real-time resources and offers limited computation, testing and debugging support. Without a debugger, finding a bug in thousands of lines of Arduino code is next to impossible.

Additionally, you may want to test your embedded system, with its sensors and actuators attached, against real physical parameters in a simulated environment, as if the system were actually exposed to the patient’s physical dynamics, respiratory physics and so on. This calls for hardware-in-the-loop testing.

Arduino is a great tool. If designers still want to leverage it for an open source ventilator, the following are a few suggestions and practices to adhere to:

Use a powerful debugging setup, such as Atmel Studio with an external in-circuit debugger.
Resort to design patterns in C/C++ and to coding standards such as JSF, Barr Group’s BARR-C:2018 Embedded C Coding Standard, or MISRA C:2012 or earlier. These standards cover formatting, classes, complexity management and so on.
Use an RTOS. The Arduino isn’t capable of running a dedicated OS, but you can use FreeRTOS on Arduino to avoid system lockups.
Use standardized processing, with an RTOS, middleware and device drivers standardized for at least Class-B.
Utilize redundancy.
Use a watchdog timer if you do not have the luxury of an RTOS. This makes sure your device keeps operating and resets if a problem occurs.
Do not rely on a single-chip design. Use a distributed architecture for the embedded design. For example, you can dedicate one Arduino to sensor inputs, another solely to feedback control, and a third as a fail-safe co-processor.
Most of all, adhere to medical and functional safety standards such as IEC 60601 and IEC 61508.
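To illustrate the watchdog suggestion from the list above, here is a software model of how a watchdog timer behaves. It is a sketch only: the Watchdog class and its method names are hypothetical, and time is passed in explicitly so the logic can be tested off-target. On AVR-based Arduino boards the hardware counterpart is typically reached through avr-libc’s wdt_enable() and wdt_reset().

```python
class Watchdog:
    """Software model of a watchdog timer.

    The main loop must call kick() more often than timeout_ms. If it stops
    doing so (for example because it hangs), expired() turns True and the
    hardware equivalent would reset the microcontroller.
    """

    def __init__(self, timeout_ms, now_ms=0):
        self.timeout_ms = timeout_ms
        self.last_kick_ms = now_ms

    def kick(self, now_ms):
        """Record that the main loop is still alive."""
        self.last_kick_ms = now_ms

    def expired(self, now_ms):
        """True when the loop has been silent for longer than the timeout."""
        return now_ms - self.last_kick_ms > self.timeout_ms
```

The design point is that the kick should come from the end of a healthy control cycle, not from a timer interrupt, otherwise the watchdog keeps being fed even when the main loop is stuck.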

WRAP UP!!

In this article we saw what embedded systems are, how a ventilator is essentially an embedded system, and what makes it a real-time, safety-critical system. We have also seen different programming and embedded software paradigms, redundancy and different design patterns.

I hope readers now have a better understanding of what goes into designing an open source ventilator.

I am listing a few books and resources to get you started.

“Design Patterns for Embedded Systems in C”, by Bruce Powel Douglass.
“Embedded Software Development for Safety-Critical Systems”, by Chris Hobbs.
“Software Engineering for Embedded Systems”, by Robert Oshana.
“Abstract State Machines”, by Egon Börger & Robert Stärk.
“Embedded C Coding Standard”, by Barr Group (pdf online).