Machine learning in medical research

A prompt for improving the future

Robert Herman
Powerful Medical
Oct 10, 2017


Data collection at its peak.

It is a common misconception that methods from the IT world, such as machine learning, have yet to be introduced to the medical world. In fact, 21st-century technological advancements have provided the resources for a long-awaited and necessary innovation in medicine.

In spite of the obvious advantages, expert systems cannot and will not (in the near future) replace human experts. We are far from automating entire healthcare systems. However, we can provide technological solutions that bring greater precision and effectiveness.

Statistics in medicine

Because research forms the basis for important clinical decisions in a doctor's day-to-day work, it is crucial that this information is supported by precise mathematical and statistical evidence. In medical research, we recognise two major branches of statistics:

  • Descriptive statistics is the summary of a collection of data, which gives us insight into the past. Using summary measures such as the sum, mean, variance or hazard ratio, we transform data into a form that anyone can interpret.
  • Predictive statistics uses information gained from descriptive statistics to forecast events which might occur in the future. It deals with the identification of patterns and construction of statistical models and algorithms.
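The descriptive branch can be sketched in a few lines of Python. This is only an illustration: the blood pressure readings are made up, and the `describe` helper is a hypothetical name, not something from a medical library.

```python
from statistics import mean, variance

# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 125, 131, 122, 140, 135]

def describe(values):
    """Summarise a sample: count, sum, mean and sample variance."""
    return {
        "n": len(values),
        "sum": sum(values),
        "mean": mean(values),
        "variance": variance(values),  # sample variance (n - 1 denominator)
    }

summary = describe(readings)
print(summary)
```

A summary like this describes what has already been observed; a predictive model would take such summaries and raw data as its starting point.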

By applying this knowledge, we can break down the most frequent task of a doctor, diagnosing a patient, into the following steps:

A known medical example

The GRACE score study is a perfect application of both branches of statistics to a medical problem. It deals with the estimation of patient outcome (death or myocardial infarction) in patients presenting with acute coronary syndrome (ACS).

The researchers’ approach to solving the problem is outlined by the following steps:

  1. Collection of data from 94 hospitals, resulting in a dataset of 43,810 patients.
  2. Description of the gathered variables (age, comorbidities, symptoms, etc.) using basic calculations, such as the hazard ratio — examining the individual relation between each variable and patient outcome.
  3. Identification of patterns and construction of a statistical model using logistic regression.
  4. Prediction of patient outcome and calculation of the accuracy and performance of the predictive model.

The predictive model was then validated on a set of Canadian patients and is now used as a tool to facilitate triage and management of ACS patients globally.
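The four steps above can be sketched on a toy scale. The example below is not the GRACE model: it uses a single made-up variable (age in decades), ten invented patients, and a hand-written logistic regression fitted by gradient descent. The names `fit_logistic` and `accuracy` are hypothetical helpers chosen for this illustration.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # prediction error for one patient
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def accuracy(xs, ys, w, b):
    """Share of cases where the 0.5-thresholded prediction matches the outcome."""
    hits = sum((sigmoid(w * x + b) >= 0.5) == bool(y) for x, y in zip(xs, ys))
    return hits / len(xs)

# Step 1: hypothetical toy data (age in decades, 1 = adverse outcome). Not GRACE data.
ages = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5]
outcomes = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

# Step 2: describe the variable; centring on the mean also helps the fit converge.
mean_age = sum(ages) / len(ages)
centred = [a - mean_age for a in ages]

# Step 3: build the model.
w, b = fit_logistic(centred, outcomes)

# Step 4: predict and measure performance (on the training data, for brevity).
risk_young = sigmoid(w * (4.0 - mean_age) + b)
risk_old = sigmoid(w * (8.5 - mean_age) + b)
print(f"risk at 40: {risk_young:.2f}, risk at 85: {risk_old:.2f}")
print(f"training accuracy: {accuracy(centred, outcomes, w, b):.2f}")
```

Measuring accuracy on the same patients used for fitting flatters the model, which is why a separate validation cohort, like the Canadian patients mentioned above, matters in practice.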


Machines assisting doctors

A simple definition of machine learning is teaching a computer to recognise patterns and predict results from the data provided, without being explicitly programmed to do so.

Models and methods in machine learning are further divided based on the type of problem we are trying to solve:

  • Classification is the determination of a label or category. The process of finding a correct diagnosis based on symptoms, patient history and test results would therefore be a classification problem.
  • Regression is used when predicting a number, for example the number of days until an event (death or patient discharge from the hospital) occurs.
  • Clustering is the recognition of patterns in data. An example from epidemiology would be the identification of risk factors of a specific disease.
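The three problem types can be contrasted with a minimal sketch. Everything here is an assumption made for illustration: the numbers are invented, and `fit_centroids`, `predict_label`, `fit_line` and `two_means` are hypothetical helpers using deliberately simple techniques (nearest centroid, ordinary least squares, and one-dimensional k-means with two clusters).

```python
from statistics import mean

def fit_centroids(samples):
    """Classification: learn one mean value per label from labelled examples."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def predict_label(centroids, value):
    """Assign the label whose learned centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def two_means(values, iterations=10):
    """Clustering: split unlabelled values into two groups (1-D k-means, k=2).

    Assumes the values are not all identical.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(group_low), mean(group_high)
    return sorted(group_low), sorted(group_high)

# Classification: hypothetical body temperatures (deg C) with known labels.
labelled = [(36.6, "no fever"), (36.9, "no fever"), (37.1, "no fever"),
            (38.4, "fever"), (39.0, "fever"), (38.7, "fever")]
centroids = fit_centroids(labelled)
label = predict_label(centroids, 38.2)

# Regression: hypothetical severity score versus days until discharge.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_days = slope * 5 + intercept

# Clustering: hypothetical lab values with no labels at all.
groups = two_means([4.8, 10.2, 5.1, 9.7, 5.3, 10.5])

print(label, predicted_days, groups)
```

The difference lies in what the data provides: classification and regression learn from examples with known answers, while clustering has to find structure without them.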

From the examples listed above, we can see that description, identification of patterns and prediction of patient outcome are all machine learning methods used in data analysis. However, there is much more to the field.


It’s time for a new era

Both machine learning and artificial intelligence are terms coined in the 1950s. Back then, algorithms were defined and calculated on paper. The following factors account for the rising interest in innovation in this field:

  • Processing power plays a major role in the construction of a complex algorithm. A billion-fold increase in computing power since 1956 allows us to analyse large amounts of data using only our laptops.
  • Data collection is at its peak in the 21st century. As more hospitals turn away from paper and wearable devices hit the market, we are transitioning into a data-driven world.
  • Technological literacy among the younger generation of doctors and awareness of the importance of data have created a booming demand for new technological solutions in medicine and research.

So far, the use of machine learning in medicine has been limited to simple algorithms that are prone to inaccuracy. More complex tasks will become solvable as technology advances further.

However, medical practitioners across the world should not panic over being replaced by machines. On the contrary, doctors will find these advancements useful in helping them carry out their jobs with greater precision and efficiency.

Man and computer are capable of achieving together what neither of them can alone.
