Machine Learning Applications in Biomedical Science

Learn how machine learning is revolutionizing biomedical science.

Payal Kumari
Geek Culture
10 min read · Dec 26, 2021

Who should read? 📖

In my previous article, I explained the concept of data analysis and the tools used to understand and analyze the large data sets available today. In this article, we’ll look at how machine learning can help handle the growing complexity and volume of data in healthcare.

What to expect? 👀

This article dives into the application of machine learning toward improved diagnosis and treatment of diseases. In addition, you will learn about the opportunities for machine learning in biomedical science.

Machine learning, as defined by Arthur Samuel in 1959, is the field of study that gives computers the ability to learn without being explicitly programmed. In other words, machine learning is a subset of AI techniques that uses statistical tools to teach computers, from data, to perform tasks that would otherwise require human judgment. There are many applications of this approach, ranging from self-driving cars (Bojarski et al., 2016) to product recommendations (Batmaz et al., 2019) to Netflix’s move from a postal DVD lending service to a video streaming provider. In the health sector, machine learning has been used to find useful patterns in large datasets such as the Human Genome Project (Venter et al., 2001) and cancer omics (Tomczak et al., 2015). This is useful for earlier, more accurate diagnosis and ongoing monitoring in support of overall health.

https://imgflip.com/i/5ym1r8

To genuinely improve human health, algorithms built on machine learning principles must produce accurate predictions. However, which algorithm is appropriate depends heavily on the type of problem to be addressed.

There are myriad approaches, each suited to different biomedical problems. In this article, you will learn about seven of these approaches.

https://towardsdatascience.com/what-are-the-types-of-machine-learning-e2b9e5d1756f

Supervised, Unsupervised, and Reinforcement learning

Supervised Learning

The name “supervised” implies that a supervisor, like a teacher, is present and providing instruction. In this case, we train the model using input data that is already labeled. Once the model is trained, we provide it with a new set of input data to analyze, and it predicts the correct output based on what it learned from the labeled data.

Let’s look at an example of a basket filled with different fruits. Our first step will be to train the model with all the different fruits by doing something like this:

If the fruit is green, oval in shape, and small in size, then it is labeled as a grape. If the fruit is red, round, and has a depression at the top, then it is labeled as an apple.

Suppose we bring a new apple and ask the model to identify it. Based on the training data, it will classify the fruit using features such as color, shape, and size, and confirm that it is an apple.
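
To make this concrete, here is a minimal sketch of the fruit example in Python, assuming scikit-learn is installed; the numeric feature encodings and the tiny training set are made up purely for illustration.

```python
# A minimal supervised-learning sketch of the fruit example,
# assuming scikit-learn is installed. Features are hypothetical
# numeric encodings: [color, shape, size].
from sklearn.tree import DecisionTreeClassifier

# Assumed encodings for illustration: color 0=green 1=red,
# shape 0=oval 1=round, size 0=small 1=large.
X_train = [
    [0, 0, 0],  # green, oval, small  -> grape
    [0, 0, 0],
    [1, 1, 1],  # red, round, large   -> apple
    [1, 1, 1],
]
y_train = ["grape", "grape", "apple", "apple"]

model = DecisionTreeClassifier()
model.fit(X_train, y_train)          # learn from the labeled examples

new_fruit = [[1, 1, 1]]              # a new red, round, large fruit
print(model.predict(new_fruit))      # -> ['apple']
```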

Unsupervised Learning

Unsupervised learning is analogous to a newborn baby. A newborn does not know who is male or female or who is good or bad; its only input comes from observing patterns, differences, and similarities, and it tries to group things based on them. Likewise, in unsupervised learning, the machine knows nothing beforehand and has to find hidden structure in unlabeled data.
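
As a rough sketch of the same idea in code, assuming scikit-learn is installed, k-means clustering can group unlabeled points by similarity alone; the data below are made up:

```python
# A minimal unsupervised-learning sketch, assuming scikit-learn is
# installed. No labels are given; k-means groups the points purely
# by similarity.
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled observations (two hypothetical measurements per sample).
X = np.array([
    [1.0, 1.1], [0.9, 1.0], [1.2, 0.8],   # one natural group
    [8.0, 8.2], [7.9, 8.1], [8.3, 7.8],   # another natural group
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)  # e.g. [0 0 0 1 1 1] -- structure found without labels
```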

Reinforcement learning

Reinforcement learning is based on rewards and a policy. Let’s say you have an agent, and that agent performs an action in an environment. Based on the outcome of the action, the agent either gets a reward (positive) or a penalty (negative). The action is also accompanied by a state change, which means something has changed in the environment. A new policy is then formulated, and it keeps being updated until the agent learns whether or not it should follow the same steps.

https://www.inwt-statistics.com/read-blog/reinforcement-learning-for-marketing-lessons-and-challenges.html
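
The agent–action–reward loop described above can be sketched with a toy Q-learning example in plain Python/NumPy; the five-state “corridor” environment and its reward are invented purely for illustration:

```python
# A toy Q-learning sketch illustrating the agent / action / reward /
# state-change loop. The environment is a hypothetical 5-state
# corridor; reaching state 4 gives a reward of +1, other steps give 0.
import numpy as np

n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # the agent's learned action values
alpha, gamma, epsilon = 0.1, 0.9, 0.1

for episode in range(500):
    state = 0
    while state != 4:                                  # until the goal is reached
        if np.random.rand() < epsilon:                 # sometimes explore
            action = np.random.randint(n_actions)
        else:                                          # otherwise follow current policy
            action = int(np.argmax(Q[state]))
        next_state = max(0, state - 1) if action == 0 else min(4, state + 1)
        reward = 1.0 if next_state == 4 else 0.0       # reward on success, none otherwise
        # Update the value estimate from (state, action, reward, next state).
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(Q)  # higher values for "right" reflect the learned policy
```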

Classification and regression

Classification and regression fall under the category of supervised learning methods. Classification predicts discrete values, such as normal versus diseased, red versus blue, or 0 versus 1. Regression predicts a continuous value as the output, such as the strength of response to a therapy or a price.

Let's understand these two with a real-life example:

You’re building a model to predict whether a given customer is eligible for a loan. The bank provides all of the customer’s information, and all you have to do is predict whether this customer is eligible or not. This is a classification problem because the output is predefined; we know it will be either yes or no.

You are developing another model, for a website, that predicts the price of a car. You are given all the specifications of the car, and the model is expected to predict its price. This is a regression problem because the price of the car could be any value in a continuous range.
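
A minimal sketch of both problems, assuming scikit-learn is installed (the features, numbers, and prices are all made up), might look like this:

```python
# Contrasting classification (discrete output) and regression
# (continuous output), assuming scikit-learn is installed.
from sklearn.linear_model import LogisticRegression, LinearRegression

# Classification: loan eligibility. Hypothetical features:
# [income in $1000s, has_collateral (0/1)].
X_loan = [[25, 1], [60, 0], [48, 1], [10, 0]]
y_loan = ["no", "yes", "yes", "no"]
clf = LogisticRegression(max_iter=1000).fit(X_loan, y_loan)
print(clf.predict([[55, 1]]))        # -> a discrete class, e.g. ['yes']

# Regression: car price. Hypothetical features: [year, mileage].
X_car = [[2015, 60_000], [2020, 10_000], [2012, 120_000]]
y_car = [8_000.0, 22_000.0, 4_500.0]
reg = LinearRegression().fit(X_car, y_car)
print(reg.predict([[2018, 30_000]])) # -> a continuous price estimate
```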

Ensemble learning

Ensemble methods build multiple models and combine their predictions, for example by averaging or voting, to produce a final prediction. Common ensemble approaches include random forests, gradient boosting, and stacking (meta-ensembles).

Let’s try understanding with a simple example without going into technical details.

There’s a movie featuring a particular actor; let’s assume his name is Chris Hemsworth. Five of your friends have chosen to go see the movie. Meanwhile, you have some work to do and have opted not to go, so you’re curious as to how the movie went and decide to ask your friends. Let’s say the first three of your friends thought the movie was good and the remaining two thought it was bad.

Now that you have the feedback from these five people, you will combine all of the information to determine whether the movie was good or bad, and then you will decide whether to see it or not. This is exactly what happens inside any ensemble model, where information from various sources is combined.

What are these sources? They are the individual models. In the example above, you can think of each friend as an individual model: in data science terms, the response from each friend is the learning of one model, and these models are then combined to prepare the final model. Here, the final model is you. Now, the question is: what is your decision? Your decision will be based on the feedback of all your friends.

This approach is widely used nowadays because it produces good results, largely because it reduces the effect of bias in the data.

So what does data bias, or in other words model bias, mean?

Let’s say one of your friends is a Chris Hemsworth fan. Even if the movie isn’t great, that friend may have a biased opinion about it and say the movie was awesome; however, this isn’t the only input to your final decision. Instead, your final judgment will be based on feedback from all of your friends, and since not all of them are Chris Hemsworth fans, the others will provide more neutral feedback.

When it comes to data, one portion of the data may tell one story that is potentially biased, whereas other portions may tell a different story; when the many stories are integrated, we obtain a combined model. This entire process is known as ensemble learning.
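
As a rough illustration, assuming scikit-learn is installed, a voting ensemble combines several member models by majority vote, much like combining your friends’ opinions; the dataset here is synthetic:

```python
# A minimal ensemble-learning sketch, assuming scikit-learn is
# installed. Several member models vote, and the majority vote
# becomes the final prediction.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# A synthetic dataset purely for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

voter = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("tree", DecisionTreeClassifier(max_depth=3)),
    ("forest", RandomForestClassifier(n_estimators=25)),
], voting="hard")                    # majority vote of the member models
voter.fit(X, y)
print(voter.predict(X[:3]))          # the combined decision of all "friends"
```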

Deep learning

Have you ever wondered how text written in one language is translated into a different language within seconds? This is an example of deep learning.

Deep learning is a subset of machine learning, which in turn is a subset of artificial intelligence. It is inspired by the structure of the human brain; in deep learning, this structure is called an artificial neural network. In machine learning, we feed hand-crafted features to the model to differentiate between classes, whereas in deep learning we don’t feed the features: the neural network tries to learn them on its own, without human intervention.
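
A minimal sketch of a small neural network, assuming TensorFlow/Keras is installed (the data and target below are made up), could look like this:

```python
# A minimal deep-learning sketch, assuming TensorFlow/Keras is
# installed. The network learns its own internal representation of
# the raw inputs instead of relying on hand-crafted features.
import numpy as np
from tensorflow import keras

X = np.random.rand(100, 8)               # 100 samples, 8 raw input values
y = (X.sum(axis=1) > 4).astype(int)      # a made-up binary target

model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),    # hidden layer learns features
    keras.layers.Dense(16, activation="relu"),    # deeper layer refines them
    keras.layers.Dense(1, activation="sigmoid"),  # probability of the positive class
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:2]))              # predicted probabilities for two samples
```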

There are also two complementary approaches in machine learning that are used to improve performance in biomedical applications, and these are as follows:

Dimensionality reduction

Dimensionality reduction helps to reduce the number of attributes or features of a dataset by selecting important features or combining features to capture variance in a dataset. It is often used to improve the performance of machine learning models and to aid visualization.
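
For example, a quick sketch with principal component analysis (PCA), assuming scikit-learn is installed and using made-up data, compresses 50 features down to 2:

```python
# A minimal dimensionality-reduction sketch using PCA, assuming
# scikit-learn is installed. A made-up 50-feature dataset (think of
# gene expression measurements) is compressed to 2 components.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(100, 50)          # 100 samples, 50 original features

pca = PCA(n_components=2)            # keep the 2 directions of highest variance
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                   # (100, 2) -- easy to visualize
print(pca.explained_variance_ratio_)     # variance captured by each component
```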

Federated learning

Federated learning is useful when data are located in multiple clinical systems or when learning from sensitive personal data. It is especially important in many biomedical applications where data contain sensitive or protected health information that cannot be easily shared.

So what exactly is federated learning?

Normally, when we train a machine learning model, we keep both the model and the data on the same device, which is known as centralized machine learning. This means that companies like Apple and Google upload our private conversations to the cloud to build their models.

The paradigm is flipped with federated learning. Rather than sending our data to the cloud, the model is sent to our devices and trained locally, so the data never leaves the device. After the model has been trained locally, the device sends model updates, not data, to the server; the server aggregates the updates from each device and updates the global model. This process is repeated over multiple rounds of training. In practice, the model is not trained on all devices at once; instead, only a fraction of devices is sampled in each round, typically those that are plugged in and idle at night, so you don’t actually see the model being trained on all of them.
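
A toy sketch of the federated averaging idea in plain NumPy (the devices, local data, and simple linear model are all made up for illustration) shows how only model weights, never raw data, reach the server:

```python
# A toy federated-averaging (FedAvg) sketch. Each "device" trains on
# its own local data and sends back only model weights; the server
# averages them into a global model.
import numpy as np

def local_training(global_weights, local_X, local_y, lr=0.1, steps=10):
    """One round of local training on a device (simple linear model)."""
    w = global_weights.copy()
    for _ in range(steps):
        grad = local_X.T @ (local_X @ w - local_y) / len(local_y)
        w -= lr * grad                  # raw data never leaves the device
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Three devices, each holding its own private data.
devices = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    devices.append((X, y))

global_w = np.zeros(2)
for round_ in range(20):                # multiple rounds of federated training
    updates = [local_training(global_w, X, y) for X, y in devices]
    global_w = np.mean(updates, axis=0) # server aggregates only the weights

print(global_w)                         # close to [2, -1] without pooling raw data
```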

How is machine learning driving biomedicine?

With technological advancement, medical professionals, scientists, and doctors are exploring the different ways machine learning can be applied in order to provide the best possible treatment to patients. One such way is the early diagnosis and staging of diseases such as cancer, brain tumors, diabetes, and several others. For early diagnosis, there are three key areas to consider: clinical diagnostics, precision treatment, and health monitoring.

Future of machine learning

Machine learning is projected to have a significant impact on many elements of health care administration and disease monitoring. Earlier and more accurate disease detection, better diagnosis, and more durable and tolerable treatments could all be made possible by machine learning. Machine learning algorithms offer numerous prospects in three fields of biomedicine: clinical diagnosis, precision therapy, and health management and monitoring. Imaging and molecular testing, for example, now generate orders of magnitude more data than in the past, necessitating the use of machine learning for automated interpretation. Computer-aided diagnosis (CAD) software based on deep learning can interpret biomedical images much as medical practitioners do.

Because it combines the strengths of both, the joint use of humans and computers in diagnosis will become increasingly prevalent in the future, resulting in more accurate diagnoses. Advanced clinical testing technologies combined with machine learning will need to weigh the tradeoffs between disease detection rates, patient outcomes, and other health-related parameters. By mining published literature and patient databases for useful expert knowledge, such as biomarker-therapy relationships and biological pathways of interest, machine learning will help improve precision medicine. Researchers can already identify several types of skin cancer and diagnose diabetes using machine learning and photographs taken with cellphones (Esteva et al., 2017; Micheletti et al., 2016).

Machine learning programs will be able to monitor individuals for deviations from the norm and alert them when a change necessitates consulting a medical professional, assisting in the accurate diagnosis and treatment of patients. Machine learning applications will be evaluated using a combination of retrospective data, iterative training, and deployment in a prospective context. As a result, a machine learning system will be able to adapt to changing health conditions or habits and generate accurate forecasts.

References

  1. End-to-end Learning for Self Driving Cars
  2. A review on deep learning for recommender systems: challenges and remedies
  3. The Sequence of the Human Genome
  4. The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge
  5. Dermatologist-level classification of skin cancer with deep neural networks

Thank you for sticking with me this far. I hope you found this article useful and that it gives you some idea of how machine learning is transforming biomedicine.
