TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

How I Predicted the Effect of Mutations on Protein Interactions Using AlphaFold

21 min read · May 7, 2024

The human interactome (the set of all protein-protein interactions) may contain up to 600,000 interactions.

With so many possible protein-protein interactions (PPIs), predicting how a disease-causing mutation affects the interactome seems like a Herculean task, but it is not as impossible as you might expect.

(Especially when you give a University of Waterloo co-op student free access to a beefy GPU cluster, world-class mentorship, and free rein to pursue any approach.)

Using the machine learning framework XGBoost, the cutting-edge deep learning software AlphaFold-Multimer (AF-M), and over 47,000 SLURM jobs, I built a multi-classifier model that predicts the effects of missense mutations on PPIs with an AUC of 0.91.

Figure: Multi-class ROC curve and AUC (image by author)
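For readers new to the metric, here is a minimal sketch of one common way to get a single AUC out of a multi-class model: compute a one-vs-rest AUC per class and take the unweighted (macro) average. This preview does not say which averaging scheme produced the headline number, so the macro one-vs-rest choice is an assumption, and the function names, class indices, and probabilities below are made up purely for illustration.

```python
from typing import Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive example is scored
    higher than a random negative example (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(
    y_true: Sequence[int],
    proba: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """Macro-averaged one-vs-rest AUC (assumed scheme, not confirmed by the article)."""
    per_class = []
    for k in range(n_classes):
        # Treat class k as "positive" and every other class as "negative".
        labels = [1 if y == k else 0 for y in y_true]
        scores = [row[k] for row in proba]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / len(per_class)


# Hypothetical toy data: 6 samples, 3 placeholder classes (0, 1, 2).
y_true = [0, 0, 1, 1, 2, 2]
proba = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]

if __name__ == "__main__":
    print(f"macro one-vs-rest AUC: {macro_ovr_auc(y_true, proba, 3):.3f}")
```

The pairwise loop is quadratic and only meant to make the definition visible; in practice a library routine such as scikit-learn's `roc_auc_score` with `multi_class="ovr"` would be the usual choice.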

In this article, I'm going to walk through:

  • The Background: The research question and why we chose it.
  • Data Acquisition & Processing: How and why we…

Written by Murto Hilali

22-y/o tech enthusiast/work in progress. All things business and biotech. More at murto.co