Towards an AI-only biosignal analysis pipeline

Artem Bachynskyi · MAWI Band
Apr 27, 2019 · 12 min read
Mawi Band — a wearable device with AI-driven analysis of ECG

Throughout the 2010s, computer vision, natural language processing, predictive analytics, and sound analysis were the most popular domains in data science. In the 2020s, however, the situation will change. New domains will become more important, and one of the applications that will grow fast is biosignal analysis: in particular, analysis of the electrocardiogram (ECG), alongside accelerometer data, the electroencephalogram (EEG), and others. The main driver of this change is the introduction of new sensors that make collecting this data far easier. Right now, the only way to obtain such data (ECG in particular) is to use medical equipment (Holter monitors or stationary 12-lead cardiographs) or to build a device yourself.

ECG, a new must-have sensor

However, in September 2018 the situation changed: Apple introduced the Apple Watch Series 4 with a built-in ECG sensor. Of course, you can't yet use it to collect raw data. But Apple is the rare company whose technologies and features become the industry standard the day they are released, so it's reasonable to assume that ECG sensors will soon become an integral part of most new smartwatches and wristbands. As a result, the demand for analysis of this kind of data will also increase.

Apple watch ECG feature presentation

At Mawi Band, we started working with ECG years before it was "cool". Back in 2014, we began developing our biosignal analysis pipeline using open-source datasets, and we even entered and won several startup competitions along the way :) But when we began working on actual products, we ran into the data collection problem described above. So in late 2016 we started developing our own portable ECG-measuring device. Here is how it looks today:

Mawi Band use case review

Since then, we have developed a number of biosignal-analysis-based MVPs.

In this article, we will share our experience of using machine learning to improve and speed up ECG processing in our products.

ECG 101

If you are new to ECG, you may think of it as the electrical activity generated by the heart, which can be measured to understand how the heart works. We suggest the following materials if you want to dive into the details.

Source: gfycat.com/ru/naiverealistichorseshoebat

Long story short, an ECG is an electrical signal that contains a lot of information about the body. It consists of waves (P, T, and the QRS complex; you can see them in the animation above), and in most cases you need to analyze these waves to obtain useful data. Even though many different tasks can be solved using ECG, a single pipeline design can serve all of them, and every stage of this pipeline can be improved with machine learning, in particular deep learning.

Signal filtering

The ECG is an electrical signal, so it can be affected by various types of noise. The simplest case is power-line interference, which occurs when external alternating current couples into the measurement. Other types include muscle noise, caused by muscular electrical activity, and motion artifacts, caused by movements of the person being measured. In practice this noise looks like the following:

ECG with (left) and without (right) baseline wandering
Different types of noise on ECG

ECG analysis is not a new technique, so many ECG filtering approaches have been developed; the most popular are FIR and IIR filters. They remove noise at specific frequencies from the signal, but there is one problem: noise often overlaps in frequency with the P and T waves of the ECG. Hence, you have two options:

  1. Remove the frequency band that contains all the noise, altering the ECG morphology and making the signal unsuitable for most kinds of analysis (though sometimes this trade-off is acceptable).
  2. Remove only the frequency bands that carry no valuable data, leaving some noise in the signal.

Fortunately, mains alternating current runs at 50 Hz or 60 Hz depending on the country, and this frequency can be excluded from the signal without any risk.
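
As an illustration, here is a minimal sketch of such a notch filter in Python with SciPy; the sampling rate and mains frequency below are example assumptions, not values from our device:

```python
# A minimal sketch of power-line interference removal with an IIR notch
# filter. Sampling rate and mains frequency are assumed example values.
import numpy as np
from scipy.signal import iirnotch, filtfilt

FS = 250.0        # sampling rate, Hz (assumed)
MAINS_HZ = 50.0   # use 60.0 in North America
Q = 30.0          # quality factor: higher = narrower notch

def remove_powerline(ecg: np.ndarray) -> np.ndarray:
    """Suppress the mains frequency while leaving ECG morphology intact."""
    b, a = iirnotch(MAINS_HZ, Q, fs=FS)
    # filtfilt runs the filter forward and backward, so the result has
    # zero phase shift and wave positions are preserved.
    return filtfilt(b, a, ecg)
```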

Example of Power-line interference removal on ECG using IIR filtering

In the picture below, you can see an example of filtering with such filters: the raw signal, a band-pass FIR filter that keeps all the valuable frequencies inside the band, and an IIR filter that completely removes low-frequency noise (baseline wandering) but alters the morphology.

Example of performance of FIR (green) and IIR (blue) filtering for noise removal on raw ECG signal (black)
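
For reference, a sketch of both filter types with SciPy; the cutoff frequencies and sampling rate are illustrative assumptions, not our production settings:

```python
# A sketch of the two filters compared above: a band-pass FIR filter that
# keeps the clinically valuable band, and an IIR high-pass filter that
# removes baseline wander aggressively at the cost of altering morphology.
# Cutoffs and sampling rate are illustrative; records should be at least
# a few seconds long for filtfilt's padding to work.
import numpy as np
from scipy.signal import firwin, butter, filtfilt

FS = 250.0  # sampling rate, Hz (assumed)

def fir_bandpass(ecg: np.ndarray, low=0.5, high=40.0, numtaps=301):
    taps = firwin(numtaps, [low, high], pass_zero=False, fs=FS)
    return filtfilt(taps, [1.0], ecg)

def iir_highpass(ecg: np.ndarray, cutoff=1.0, order=4):
    b, a = butter(order, cutoff, btype="highpass", fs=FS)
    return filtfilt(b, a, ecg)
```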

The filters mentioned above perform quite well at removing baseline and power-line noise. However, they don't handle noise caused by body movements. This is not an issue for stationary ECG recordings, but with wearable devices, users tend to make sudden movements during a recording, and such muscle activity significantly corrupts the signal, making it unsuitable for analysis. We struggled for a long time to find a solution and eventually turned to machine learning, in particular denoising convolutional autoencoders.

Autoencoder architecture. Source: http://dkopczyk.quantee.co.uk/dae-part1/

Traditionally, convolutional autoencoders are used on 2D data; in our case, we adapted the technique to signals by using 1D convolution and pooling layers. An integral part of any machine learning model is data: to train an autoencoder, we need pairs of corresponding noisy and clean fragments of the signal to feed as input and target. Obviously, it's impossible to record both a clean and a noisy ECG simultaneously, so we resorted to the following trick. First, we took many clean records from Mawi Band; then we added randomly sampled electromyogram (EMG) signals from another dataset on top of the selected ECG records, obtaining "noisy" samples close to the real-world noise produced by muscular movements. Finally, we trained an autoencoder to minimize the reconstruction error. Our colleagues from SoftServe went even further and applied generative adversarial networks to the same problem:

https://github.com/softserveinc-rnd/ecg-denoise
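
Going back to the autoencoder approach, here is a minimal PyTorch sketch of a 1D denoising convolutional autoencoder of the kind described above; the layer sizes are illustrative, not our production architecture:

```python
# A minimal 1D denoising convolutional autoencoder sketch. Training pairs
# are (noisy, clean) fragments built by adding sampled EMG noise to clean
# ECG records, as described above. Input length should be divisible by 4.
import torch
import torch.nn as nn

class DenoisingAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.decoder = nn.Sequential(
            nn.Upsample(scale_factor=2),
            nn.Conv1d(32, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.Upsample(scale_factor=2),
            nn.Conv1d(16, 1, kernel_size=9, padding=4),
        )

    def forward(self, x):  # x: (batch, 1, length)
        return self.decoder(self.encoder(x))

model = DenoisingAE()
loss_fn = nn.MSELoss()  # reconstruction error against the clean signal
```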

Morphological annotation

Annotation is a crucial procedure in ECG analysis. Its goal is to find important points called R-peaks, which correspond to heart contraction events. Knowing these points lets us obtain the sequence of distances between successive beats; this sequence represents the heart rhythm and can be used further for HRV analysis. Besides R-peaks, there are P, Q, S, and T points that characterize the cardiac cycle and should also be annotated for deeper analysis.

Visualization of a single heartbeat’s morphological parts
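
To make the rhythm step concrete, here is a tiny sketch of turning annotated R-peak positions into an RR-interval sequence; the sampling rate is an assumed example value:

```python
# R-peaks to RR intervals: the input is sample indices of detected
# R-peaks, the output is distances between successive beats in ms.
import numpy as np

FS = 250.0  # sampling rate, Hz (assumed)

def rr_intervals_ms(r_peaks: np.ndarray) -> np.ndarray:
    """Distances between successive heartbeats, in milliseconds."""
    return np.diff(r_peaks) / FS * 1000.0

rr = rr_intervals_ms(np.array([110, 312, 509, 714]))  # -> [808., 788., 820.]
```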

There are many algorithms for automatic ECG annotation; the Pan-Tompkins algorithm and its modifications, such as Hamilton's, are the most widely used. These algorithms rely on a set of rules to localize the positions of R-peaks. In practice, such an annotator works well only on a relatively clean signal: if even weak noise is present, R-peaks can be misannotated despite being clearly distinguishable. A rule-based annotator may also fail by confusing the R-peak with the top of the T wave when the latter is higher, as in the image below:

Example of ECG with a very high T wave, https://litfl.com/de-winter-t-wave-ecg-library/
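
To illustrate the rule-based idea, here is a toy threshold-based detector, far simpler than Pan-Tompkins; on a record where the T wave is taller than the R-peak, it misfires in exactly the way described above:

```python
# A toy rule-based R-peak detector: keep prominent local maxima that are
# at least a refractory period apart. Threshold and sampling rate are
# illustrative assumptions.
import numpy as np
from scipy.signal import find_peaks

FS = 250.0  # sampling rate, Hz (assumed)

def naive_r_peaks(ecg: np.ndarray) -> np.ndarray:
    peaks, _ = find_peaks(
        ecg,
        height=0.6 * ecg.max(),      # peaks must stand out
        distance=int(0.2 * FS),      # >= 200 ms between beats
    )
    return peaks
```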

We found that the presence of noise significantly decreases the quality of annotation, which in turn corrupts the RR-interval sequence and, subsequently, certain HRV features (especially in the frequency domain). Thus, we decided to upgrade our annotator by replacing the rule-based model with a trained machine learning one. The key was to correctly choose the input/output data and the cost function to optimize. We ended up with a structure that (again) looks like an autoencoder: it takes a short (several seconds) ECG fragment as input, while the output is a sparse vector of the same length with ones at the positions of R-peaks and zeros everywhere else (in the simple version). After training, it gives the following results:

Examples of R peaks found by the first version of the neural network
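
For reference, a minimal sketch of such a per-sample annotation network in PyTorch; the layer sizes are illustrative, and the real model is more elaborate:

```python
# A segmentation-style annotator sketch: an ECG fragment maps to a
# same-length vector of per-sample logits, trained against a sparse 0/1
# mask marking R-peak positions.
import torch
import torch.nn as nn

class RPeakNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=15, padding=7), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=15, padding=7), nn.ReLU(),
            nn.Conv1d(32, 1, kernel_size=1),  # one logit per sample
        )

    def forward(self, x):  # x: (batch, 1, length) -> (batch, 1, length)
        return self.net(x)

loss_fn = nn.BCEWithLogitsLoss()  # target: 1.0 at R-peak samples, else 0.0
```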

A slightly enhanced version, which can also predict P and T waves alongside the QRS complex, works as follows:

An example of the P wave, QRS complex and T wave found by a neural network

Anomaly detection / signal validation

So we know how to filter noise, but what if the noise is so strong that nothing is left of the original ECG signal? The first graph in the following picture shows such a situation: due to limited analog-to-digital converter resolution and high skin resistance, the signal collapses into a nearly flat line. It may still be possible to recognize an ECG pattern in such a signal, but you shouldn't analyze it. So you need a module that decides whether the signal is good enough.

Examples of a totally anomalous signal (top), a slightly corrupted signal (middle) and a clean one (bottom)

Practitioners solve this task too: if you have ever had problems during an ECG measurement in a hospital, you are familiar with the most straightforward validation process. It is often done manually by the doctors taking the ECG, who adjust the setup until the signal looks good enough. But that doesn't mean it always has to be done manually.

Basically, ECG validation means estimating the quality of the signal. One way to do it is to annotate the ECG and compute a "normal heartbeat" template by averaging all of the beats. The distance between each beat and the "normal heartbeat" can then serve as a signal quality measure: if the ECG quality is too low or a peak was annotated incorrectly, the distance is effectively computed between the "normal heartbeat" and a random signal interval, and so it will be very high. But this method has substantial disadvantages: you can't use it in real time, and it depends heavily on the quality of the annotator.

High-quality ECG beats compared to low-quality ECG beats
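
A sketch of this template-based approach; the window size and the choice of Euclidean distance are illustrative assumptions:

```python
# Template-based quality estimation: average all beats into a "normal
# heartbeat" template, then score each beat by its distance to it.
# Windows are centered on annotated R-peaks.
import numpy as np

def beat_quality(ecg: np.ndarray, r_peaks: np.ndarray, half_win: int = 60):
    beats = np.stack([ecg[r - half_win:r + half_win] for r in r_peaks
                      if half_win <= r < len(ecg) - half_win])
    template = beats.mean(axis=0)  # the "normal heartbeat"
    # Large distances flag noisy beats or misannotated peaks (the window
    # then contains a random signal interval instead of a beat).
    return np.linalg.norm(beats - template, axis=1)
```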

Another signal quality estimation algorithm is more complicated to implement but is based on a simple idea: to distinguish noisy from clean signals, you create a dataset of both and teach an algorithm to differentiate between them. There is a vast variety of ways to implement this. To keep things simple, we won't walk through all of the experiments we ran and will just share the outcome: we chose an architecture that uses convolutional layers for dimensionality reduction followed by a simple classifier for signal quality estimation.

CNN architecture for signal validation/anomaly detection
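
A minimal PyTorch sketch of such a validator, with illustrative layer sizes rather than our exact architecture:

```python
# Convolutional layers compress the fragment into a compact
# representation; a small linear head classifies it as clean or noisy.
import torch
import torch.nn as nn

class SignalValidator(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, padding=4), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.AdaptiveAvgPool1d(1),  # length-independent pooling
        )
        self.classifier = nn.Linear(32, 1)  # logit: clean vs. noisy

    def forward(self, x):  # x: (batch, 1, length)
        return self.classifier(self.features(x).squeeze(-1))
```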

But to create a truly good signal validator, we need to go further and understand exactly what it will be used for. Suppose we are creating a product that uses the ECG signal for HRV analysis. In this case, baseline wandering or noise that makes it harder to detect P waves doesn't matter, as long as we can correctly identify the R-peaks.

In the pictures below you can see two examples of ECG signals with annotated R-peaks. Both were marked as noisy by the algorithm described above. The first looks less noisy but has two missed R-peaks, while the second has stronger baseline wandering and yet all the peaks are annotated correctly. So the first segment should be excluded, while the second can be analyzed successfully. This case shows that to build a good ECG validation algorithm, you need to take into account not just the general problem formulation (such as anomaly detection), but also auxiliary goals related to the problem you are actually trying to solve.

Examples of two noisy signals, where R peaks are annotated wrongly (left) and correctly (right)

So a universal ECG validation algorithm is rarely the best option for a product: the same signal can be acceptable in one case and too noisy in another. For example, to validate the ECG signal for HRV analysis, the "confidence" of the R-peak annotation should be treated as the signal quality measure. The situation is similar for all the tasks we work on, so we use a universal validator only at the prototyping stage.

Atrial fibrillation detection

Atrial fibrillation (AFib) is the most widespread type of arrhythmia; it leads to a severe worsening of health and an increased risk of heart failure, dementia, and stroke. Early detection and proper medication can prevent these negative effects. However, signs of AFib may occur only intermittently, so they can go undetected during hospital screening or even a 24-hour Holter recording. Research suggests that taking short recordings several times per day increases the probability of detection. The growing popularity of wearable ECG event recorders makes it practically impossible for practitioners to review all the records, which calls for algorithms for robust analysis and, in particular, arrhythmia detection.

Normal sinus rhythm
Atrial fibrillation rhythm

As its name suggests, an arrhythmia is a condition that causes rhythm abnormalities. The Lorenz plot (a.k.a. scattergram) is a simple and robust tool for inspecting rhythm irregularities: each point's x-coordinate is the current RR-interval length and its y-coordinate is the subsequent RR-interval length. If the majority of points lie close to the main diagonal, no arrhythmia is considered detected, while wide scatter of the points reflects the presence of an arrhythmia.

Lorenz plots of sinus rhythm and AFib rhythm
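
Building such a plot from an RR-interval sequence takes just a few lines; here is a sketch with matplotlib:

```python
# Lorenz plot (scattergram): each point is (current RR, next RR).
# Sinus rhythm clusters near the diagonal; AFib scatters widely.
import matplotlib.pyplot as plt
import numpy as np

def lorenz_plot(rr_ms: np.ndarray):
    plt.scatter(rr_ms[:-1], rr_ms[1:], s=8)
    lo, hi = rr_ms.min(), rr_ms.max()
    plt.plot([lo, hi], [lo, hi], "k--")  # main diagonal for reference
    plt.xlabel("RR(n), ms")
    plt.ylabel("RR(n+1), ms")
    plt.show()
```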

Despite its simplicity and interpretability, the scatter-plot method has several disadvantages: first, it requires human analysis; second, it does not distinguish harmless sinus arrhythmia from AFib. That is because AFib causes not only rhythm disorders but also morphological changes in the ECG signal, such as the absence of the P wave and the appearance of f waves, highlighted by the blue and red arrows respectively in the image below.

Source: https://en.wikipedia.org/wiki/Atrial_fibrillation#/media/File:Afib_ecg.jpg

Since the features of AFib form a well-distinguishable pattern, it can be assumed that a deep CNN will handle this task. Fortunately, we didn't have to look far: the Stanford ML Group demonstrated the power of deep convolutional networks in the paper "Cardiologist-Level Arrhythmia Detection With Convolutional Neural Networks", which describes a 34-layer model with residual connections. The model's test set was annotated by a committee of three cardiologists, and the classification performance the model showed in detecting a wide range of heart arrhythmias (including AFib) from single-lead ECG records was human-level. Inspired by this paper, we decided to train our own AFib/not-AFib detector using a CNN.

Comparison of different neural network architectures' accuracies on AF detection datasets

We took the well-known MIT-BIH Atrial Fibrillation database and the PhysioNet Challenge database to build the training, validation, and test sets. The model contained 5 residual blocks with the same structure as a ResNet residual block, but rebuilt for 1D inputs. After training for 80 epochs, the model showed quite good performance, with 98.98% accuracy on the test set. Moreover, we tested this model on real hospital data, and it showed the same decent accuracy thanks to its transfer learning abilities!
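
For illustration, here is a sketch of one such 1D residual block in PyTorch; the channel count and kernel size are assumptions, not our exact configuration:

```python
# A ResNet-style residual block rebuilt with Conv1d layers; the real
# model stacks five such blocks.
import torch
import torch.nn as nn

class ResidualBlock1D(nn.Module):
    def __init__(self, channels: int = 64, kernel: int = 15):
        super().__init__()
        pad = kernel // 2  # keep the sequence length unchanged
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel, padding=pad),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel, padding=pad),
            nn.BatchNorm1d(channels),
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.body(x) + x)  # residual (skip) connection
```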

Summary

In this article, we described how we replaced the "classical" algorithmic approaches to the main stages of ECG processing:

  • Signal filtering
  • Anomaly detection
  • Morphological annotation
  • Classification

with corresponding ML-based alternatives. We would like to encourage other researchers to apply the same or similar approaches to move beyond the simplistic, inaccurate, and often inefficient routines of the classical digital signal processing toolbox.

P.S. If you're interested in biosignal analysis research or want to apply the above-mentioned approaches in your own ECG applications, we invite you to the Mawi Research Platform! We offer our device and all the supporting analytics for your own experiments, so you don't have to figure out how to extract raw data from expensive medical devices and can just concentrate on research :)
