Deep Learning — What’s the hype about?

Healthcare Part 2

Harry Hallock
Deep Neuron Lab
6 min read · Feb 18, 2019

AI is transforming the health industry. As a society we are becoming more data-hungry than ever before, and this is also evident at an individual level through our growing fascination with wearable technology and e-health. Previously, we summed up AI and deep learning from a beginner’s perspective and discussed some of their use cases in healthcare, specifically medical diagnoses. Below, we continue in healthcare, providing a brief overview of deep learning in drug discovery, e-health and electronic health records.

Drug Discovery

Drug discovery, from idea conception to a marketable product, can take over a decade and on average costs US$2.6 billion. The potential of deep learning in big pharma was highlighted in 2012, when a deep learning algorithm won the Merck Molecular Activity Challenge. Given numerical descriptors of molecules’ chemical structures across 15 datasets, the objective was to create the best technique for predicting the on- and off-target biological activities of different molecules. The winning solution combined single-task neural networks, multi-task neural networks and Gaussian process regression, and used minimal pre-processing and no feature engineering. Since then, many big pharma companies have partnered with AI companies: in 2012 Merck partnered with Numerate, in 2016 Pfizer paired with IBM Watson, and in 2017 GSK entered a €37M drug discovery collaboration with Exscientia.

The stages of drug discovery, as outlined here, are highly specific, regulated and time-consuming. To date, AI and deep learning have been implemented in the early stages of the process, such as:

  • screening large databases for compounds that could be potential drugs;
  • refining potential compounds;
  • optimising the lead compound; and
  • predicting how the lead compound will perform in testing.

BenevolentAI, a company combining AI with drug discovery, has taken advantage of the mass digitisation of information, creating an AI platform that leverages natural-language processing to analyse masses of research articles, patents, clinical trials, scientific databases and patient records to generate new insights for drug discovery. The platform can infer relationships between information such as symptoms, diseases, drugs and genes, allowing for the creation of knowledge graphs and thus helping to identify potential drugs and drug targets. Similarly, IBM Watson for Drug Discovery uses AI to find new insights for drug discovery by trawling through its database, which in 2016 was reported to contain over 25 million abstracts, 1 million full-text journal articles and 4 million patents.
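
To make the knowledge-graph idea concrete, here is a minimal sketch in Python (emphatically not BenevolentAI’s or IBM’s actual pipeline) that builds a tiny graph from hypothetical (entity, relation, entity) triples of the kind a relation-extraction step might emit, then asks it for drugs plausibly linked to a disease. All entity names and relations below are invented for illustration.

    # Minimal sketch: build a small biomedical knowledge graph from
    # hypothetical relation triples and query it for candidate drugs.
    import networkx as nx

    # Hypothetical triples an NLP relation-extraction step might produce
    triples = [
        ("drug_A", "inhibits", "gene_X"),
        ("gene_X", "associated_with", "disease_Y"),
        ("drug_B", "treats", "disease_Y"),
        ("drug_A", "causes", "symptom_Z"),
    ]

    graph = nx.DiGraph()
    for head, relation, tail in triples:
        graph.add_edge(head, tail, relation=relation)

    # Naive inference: drugs within two hops of the disease are candidates
    disease = "disease_Y"
    candidates = {
        node for node in graph.nodes
        if node.startswith("drug_")
        and nx.has_path(graph, node, disease)
        and nx.shortest_path_length(graph, node, disease) <= 2
    }
    print(candidates)  # {'drug_A', 'drug_B'}

Real platforms of course work with millions of entities and learned (rather than hand-written) relations, but the underlying structure is the same: a graph whose paths suggest new drug–target–disease hypotheses.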

AI has also been used extensively within quantitative structure–activity relationships (QSAR), an area of computational modelling within biochemistry that predicts how the activity of a compound changes when its structure is modified. This was ultimately the aim of the 2012 Merck Molecular Activity Challenge. Initially, support vector machines and random forests were used, but studies are now showing that deep learning produces better results (Ma, Sheridan, Liaw, Dahl, & Svetnik, 2015).
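
For readers curious what such a QSAR model looks like in code, below is a minimal, hedged sketch of a multi-task feedforward network in PyTorch that maps a fixed-length molecular descriptor vector (such as a fingerprint) to several activity endpoints at once. The layer sizes, number of tasks and random training batch are placeholder assumptions, not the Merck-challenge winning model or the exact architecture from Ma et al. (2015).

    # Multi-task QSAR sketch in PyTorch (illustrative sizes, random data)
    import torch
    import torch.nn as nn

    N_DESCRIPTORS = 1024   # e.g. length of a molecular fingerprint
    N_TASKS = 15           # e.g. one output per biological activity dataset

    class MultiTaskQSARNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.shared = nn.Sequential(          # layers shared across tasks
                nn.Linear(N_DESCRIPTORS, 512), nn.ReLU(), nn.Dropout(0.25),
                nn.Linear(512, 128), nn.ReLU(),
            )
            self.heads = nn.Linear(128, N_TASKS)  # one regression output per task

        def forward(self, x):
            return self.heads(self.shared(x))

    model = MultiTaskQSARNet()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # Placeholder batch: 64 molecules, random descriptors and activities
    descriptors = torch.randn(64, N_DESCRIPTORS)
    activities = torch.randn(64, N_TASKS)

    prediction = model(descriptors)     # one training step
    loss = loss_fn(prediction, activities)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

The multi-task setup is the interesting part: because the shared layers see data from all endpoints, information learned for one assay can help predictions for another, which is one reason multi-task networks did well in the challenge.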

E-health and Wearables

Global estimates suggest that over 2.5 billion people use a smartphone. As such, companies, governments and healthcare institutions have realised the importance of e-health (tele- and mobile health). With an increasing number of wearables that pair with smartphones, and an ever-growing list of health apps, an enormous amount of health data is being tracked and collected. This is evident in the recent partnership between Fitbit and Google. With the aim of providing consumers with more meaningful health data, Fitbit will use Google’s Cloud Healthcare API, and thus its AI capabilities, to combine data from its wearables with electronic health records (EHRs). Apple is following suit, recently unveiling its Watch Series 4, which can act as a single-lead electrocardiogram (ECG) and provide corresponding health alerts. After the user captures their ECG on the watch, an algorithm trained on labelled ECG data predicts whether they have a suspected arrhythmia and notifies them if so. Ultimately, wearables give individuals direct access to their tracked health data, empowering them to actively monitor their own health, which can lead to better health outcomes.
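
As a rough illustration of how such a classifier could be structured (an assumption-laden sketch, not Apple’s algorithm), the snippet below defines a small 1D convolutional network in PyTorch that takes a 30-second single-lead ECG trace and outputs class probabilities, e.g. sinus rhythm versus suspected atrial fibrillation. The sample rate, window length, labels and layer sizes are all illustrative.

    # Hedged sketch: single-lead ECG classification with a 1D CNN
    import torch
    import torch.nn as nn

    SAMPLE_RATE = 250              # assumed samples per second
    WINDOW_SECONDS = 30            # a 30-second recording
    N_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS
    N_CLASSES = 2                  # e.g. sinus rhythm vs. suspected arrhythmia

    model = nn.Sequential(
        nn.Conv1d(1, 16, kernel_size=7, stride=2), nn.ReLU(),
        nn.Conv1d(16, 32, kernel_size=7, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool1d(1),   # pool over the time dimension
        nn.Flatten(),
        nn.Linear(32, N_CLASSES),
    )

    ecg = torch.randn(1, 1, N_SAMPLES)   # one recording: (batch, channel, time)
    logits = model(ecg)
    probabilities = torch.softmax(logits, dim=1)
    print(probabilities)                 # e.g. tensor([[0.52, 0.48]])

In a deployed system the weights would be learned from large sets of clinician-labelled ECGs, and the alert logic would sit on top of the predicted probabilities.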

An exciting use case was published in 2018 by Cardiogram, who created DeepHeart, a semi-supervised multi-task LSTM designed for cardiovascular risk prediction. Ballinger et al. (2018) combined EHRs with heart rate and step-count data from 14,011 Apple Watch users. Using semi-supervised sequence learning and heuristic pretraining, they were able to detect diabetes (0.85), high cholesterol (0.74), high blood pressure (0.81) and sleep apnoea (0.83) with reasonably strong discrimination (the values in parentheses are the reported c-statistics). What makes this research even more exciting is that semi-supervised sequence learning has been around for a while; it is commonly used in recommendation systems by companies like Amazon and Netflix. As such, implementation of such technology is likely imminent. Perhaps the largest limitation of deep learning for e-health is that, due to hardware restrictions, much of the processing cannot be done on the wearable or phone itself. However, with continuous advances in hardware and more efficient algorithms, deep learning could quite realistically revolutionise healthcare through e-health.
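
For intuition, here is a hedged sketch of a multi-task LSTM over wearable time series, loosely in the spirit of DeepHeart: one recurrent encoder shared across four condition-specific output heads. The feature set (hourly heart rate and step count), hidden size and head definitions are illustrative assumptions rather than the published architecture, and the semi-supervised pretraining step is omitted.

    # Hedged sketch: multi-task LSTM over heart-rate and step-count sequences
    import torch
    import torch.nn as nn

    N_FEATURES = 2      # heart rate and step count per time step
    HIDDEN = 64
    TASKS = ["diabetes", "high_cholesterol", "high_blood_pressure", "sleep_apnoea"]

    class MultiTaskWearableLSTM(nn.Module):
        def __init__(self):
            super().__init__()
            self.lstm = nn.LSTM(N_FEATURES, HIDDEN, batch_first=True)
            # One binary head per condition, sharing the LSTM representation
            self.heads = nn.ModuleDict({t: nn.Linear(HIDDEN, 1) for t in TASKS})

        def forward(self, x):
            _, (h_n, _) = self.lstm(x)   # final hidden state summarises the sequence
            summary = h_n[-1]            # shape: (batch, HIDDEN)
            return {t: torch.sigmoid(head(summary)) for t, head in self.heads.items()}

    model = MultiTaskWearableLSTM()
    week = torch.randn(8, 7 * 24, N_FEATURES)      # 8 users, hourly readings for a week
    risks = model(week)
    print({t: p.shape for t, p in risks.items()})  # each head returns (8, 1) risk scores

Sharing one encoder across conditions is what makes the model “multi-task”: physiological patterns useful for predicting one condition can also inform the others.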

Electronic Health Records (EHRs)

EHRs are structured databases of patient health records that allow for streamlined sharing of data across healthcare settings. They may include general information, such as patient demographics or immunisation status, or more detailed information, such as vital signs during the last hospitalisation. There has been much interest in using EHRs to predict future health status and thus improve health outcomes. However, data from EHRs can be extremely noisy, heterogeneous and incomplete. For example, numerous different words, vitals or measurements could be used to describe a single illness. This makes it difficult to implement AI without a domain expert (e.g. a medical professional) labelling the data, which is not feasible on large datasets. Nonetheless, unsupervised deep feature learning appears to be successful, with Miotto, Li, Kidd, and Dudley (2016) showing that a three-layer stacked denoising autoencoder could predict future disease in individuals better than current clinical standards. Similar use cases have also been illustrated by Lasko, Denny, and Levy (2013) and Pham, Tran, Phung, and Venkatesh (2017).
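
To illustrate the idea of unsupervised feature learning on EHRs, the sketch below implements a single denoising-autoencoder layer in PyTorch: a vector of EHR-derived counts is randomly masked, and the network is trained to reconstruct the clean version, yielding a compact patient representation without any labels. The input size, corruption rate and layer width are assumptions for illustration; the “Deep Patient” model of Miotto et al. stacks three such layers.

    # Hedged sketch: one denoising-autoencoder layer for EHR feature learning
    import torch
    import torch.nn as nn

    N_EHR_FEATURES = 500   # e.g. counts of diagnoses, medications, lab tests
    CORRUPTION = 0.2       # fraction of inputs randomly masked during training

    encoder = nn.Sequential(nn.Linear(N_EHR_FEATURES, 128), nn.Sigmoid())
    decoder = nn.Sequential(nn.Linear(128, N_EHR_FEATURES), nn.Sigmoid())
    optimizer = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
    loss_fn = nn.MSELoss()

    patients = torch.rand(32, N_EHR_FEATURES)      # placeholder patient vectors

    # One training step: corrupt the input, reconstruct the clean version
    mask = (torch.rand_like(patients) > CORRUPTION).float()
    reconstruction = decoder(encoder(patients * mask))
    loss = loss_fn(reconstruction, patients)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # After training, the encoder output is the learned patient representation
    representation = encoder(patients)              # shape: (32, 128)

Because the training signal is reconstruction rather than a diagnosis label, no clinician annotation is needed; the learned representation can then feed a downstream disease-prediction model.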

The value of EHRs for predicting medical outcomes is also beginning to be realised by the tech giants, evident in a recent collaboration between Google AI, UC San Francisco, Stanford Medicine, and The University of Chicago Medicine. First, Rajkomar et al. (2018) used a generic data processing pipeline to convert raw EHR data into Fast Healthcare Interoperability Resources (FHIR) outputs, a modern data format standard. This means the data can be largely normalised without manual feature harmonisation, making EHRs from different hospitals comparable. Second, using 46,864,534,945 data points from 216,221 hospitalisations, the authors trained an LSTM, a feedforward network (FFN) with time-aware attention, and a boosted embedded time-series model to accurately predict numerous outcomes, including in-hospital mortality, prolonged length of stay and the patient’s final discharge diagnoses; performance was better than traditional clinically-used predictive models.
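
To give a feel for the harmonisation step, the snippet below sketches how one raw EHR row might be mapped into a simplified FHIR Observation-style record. The raw row and the mapping are invented for illustration and are far simpler than the pipeline described by Rajkomar et al. (2018), but they show why a shared format makes records from different hospitals comparable.

    # Hedged sketch: map one raw EHR row to a simplified FHIR Observation dict
    raw_row = {                      # hypothetical export from a hospital system
        "patient_id": "12345",
        "measurement": "heart_rate",
        "value": 88,
        "unit": "beats/min",
        "recorded_at": "2018-01-15T08:30:00Z",
    }

    def to_fhir_observation(row):
        """Return a minimal FHIR-style Observation for a vital-sign row."""
        return {
            "resourceType": "Observation",
            "status": "final",
            "code": {"text": row["measurement"]},   # a real mapping would add
                                                    # standard codes (e.g. LOINC)
            "subject": {"reference": "Patient/" + row["patient_id"]},
            "effectiveDateTime": row["recorded_at"],
            "valueQuantity": {"value": row["value"], "unit": row["unit"]},
        }

    observation = to_fhir_observation(raw_row)
    print(observation["valueQuantity"])   # {'value': 88, 'unit': 'beats/min'}

Once every hospital’s idiosyncratic export lands in the same resource structure, sequence models like the LSTM above can be trained across sites without hand-crafting features for each one.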

The examples discussed in this post and the previous one are just some of the many use cases of deep learning in healthcare. With increasing interest in, and thus investment in, AI, the possibilities to transform and revolutionise healthcare are almost endless. Nonetheless, to avoid repeating mistakes of the recent past, as Cristea, Cahan, and Ioannidis (2019) suggest, we must ensure that digital healthcare is still held to a high level of scientific scrutiny.

References

Ballinger, B., Hsieh, J., Singh, A., Sohoni, N., Wang, J., Tison, G. H., . . . Pletcher, M. J. (2018). DeepHeart: Semi-Supervised Sequence Learning for Cardiovascular Risk Prediction.

Cristea, I. A., Cahan, E. M., & Ioannidis, J. P. A. (2019). Stealth research: Lack of peer-reviewed evidence from healthcare unicorns. European Journal of Clinical Investigation, 0(0), e13072. doi:10.1111/eci.13072

Lasko, T. A., Denny, J. C., & Levy, M. A. (2013). Computational Phenotype Discovery Using Unsupervised Feature Learning over Noisy, Sparse, and Irregular Clinical Data. PLOS ONE, 8(6), e66341. doi:10.1371/journal.pone.0066341

Ma, J., Sheridan, R. P., Liaw, A., Dahl, G. E., & Svetnik, V. (2015). Deep Neural Nets as a Method for Quantitative Structure–Activity Relationships. Journal of Chemical Information and Modeling, 55(2), 263–274. doi:10.1021/ci500747n

Miotto, R., Li, L., Kidd, B. A., & Dudley, J. T. (2016). Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records. Sci Rep, 6, 26094. doi:10.1038/srep26094

Pham, T., Tran, T., Phung, D., & Venkatesh, S. (2017). Predicting healthcare trajectories from medical records: A deep learning approach. Journal of Biomedical Informatics, 69, 218–229. doi:10.1016/j.jbi.2017.04.001

Rajkomar, A., Oren, E., Chen, K., Dai, A. M., Hajaj, N., Hardt, M., . . . Dean, J. (2018). Scalable and accurate deep learning with electronic health records. npj Digital Medicine, 1(1), 18. doi:10.1038/s41746-018-0029-1

