#ResearchToThePeopleEducation: An open, educational platform for better biomedical research hackathons.

Lily Vittayarukskul
Published in SVAI
5 min read · Apr 17, 2019

INTRODUCTION

At SVAI, we want to facilitate and support a research community for underserved patients — focused on the undiagnosed and rare disease community. ⭐

More specifically, we’re interested in the expandability of collaborative research and new models of how discovery happens. Our research series started with one patient with a rare neuro-cancer, NF2; we then hosted another successful AI Genomics hackathon focused on p1RCC.
Now, we will host the first hackathon for a patient with a heart of gold and an undiagnosed metabolic disease: Undiagnosed-1. Sign up here! ✔️

One of my main goals at SVAI has been to improve the transparency, interpretability, and effectiveness of the community research done here. Since joining, I have identified four major bottlenecks in doing the research:

  1. Understanding the Data
  2. Hacking on the Cloud (specifically, Google Cloud Platform)
  3. Understanding the biological aspect of the research
  4. Understanding the artificial intelligence aspect of the research

To address these bottlenecks, I created an open, educational platform called Research to the People, v.1.0.

The platform is hosted on GitBook!

Why this Platform? 🤔

Research to the People hosts resources and knowledge for tackling these major bottlenecks, so that we can dive deep into relevant tools and approaches. The hope is that, with your talented mind collaborating with other talented minds in this community, we can creatively repurpose these tools and approaches to tackle any of these three major areas of research (as relevant to the patient):

  1. patient categorization
  2. fundamental biological processes underlying human diseases
  3. treatment, or the development of new treatments.

Addressing the Bottlenecks 🍾

1. Understanding the Data 📊

We dive into the data that allows our research community to yield robust, promising insights into a patient’s clinical case. Depending on the type of disease the patient has, we gather a unique portfolio of data. That data may be genomic/genetic, transcriptomic, proteomic, metabolic, and/or clinical.

At a minimum, we try to gather clinical and genomic data. Transcriptome data is a great next addition.

2. Hacking on the Cloud (specifically, Google Cloud Platform) ☁️👩‍💻

To access the data and hack on it, you need to be assigned to a project on Google Cloud Platform (GCP). Once you have a project on GCP, you’ll learn how to use Cloud Storage and Compute Engine. You’ll mainly be guided through how to store and access your data (in buckets) and how to compute on the data in the cloud (on a VM).
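As a rough illustration of the bucket-to-VM step, here is a minimal Python sketch. It assumes the `google-cloud-storage` client library is installed on your VM and that your credentials already have access to the project; the bucket and object names are hypothetical placeholders, not real SVAI resources, and the platform’s own walk-throughs may do this differently.

```python
from pathlib import Path


def split_gs_uri(uri: str) -> tuple[str, str]:
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, object_name = uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"URI must name a bucket and an object: {uri!r}")
    return bucket, object_name


def download_object(uri: str, dest_dir: str = ".") -> Path:
    """Copy one object from a bucket to local disk (for example, on a VM)."""
    # Imported here so the URI helper above works without the client installed.
    from google.cloud import storage

    bucket_name, object_name = split_gs_uri(uri)
    dest = Path(dest_dir) / Path(object_name).name
    client = storage.Client()  # picks up Application Default Credentials
    client.bucket(bucket_name).blob(object_name).download_to_filename(str(dest))
    return dest


if __name__ == "__main__":
    # Hypothetical bucket and object names, for illustration only.
    print(split_gs_uri("gs://example-hackathon-data/genomics/sample.vcf"))
```

Once the file is on the VM’s disk, you can compute on it like any local file.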

Architecture for running a distributed training job on Cloud ML Engine and using Cloud Datalab to execute predictions with your trained model.

3. Understanding the biological aspect of the research 🧬

Understanding biology leads to the majority of groundbreaking insights in biomedicine. The point of this section is to get you familiar with the fundamental concepts underlying many powerful approaches for better understanding a particular disease or discovering relevant insights.

A snippet of the genomic application code walk-throughs 🚶

4. Understanding the artificial intelligence aspect of the research 🤖

Machine learning and deep learning approaches have been fundamental to many groundbreaking insights into biomedicine. The point of this section is to get you familiar with fundamental concepts underlying many powerful algorithms — basically, you’re getting some crash courses. This section is by no means comprehensive, but don’t worry, we’ll share a few potentially helpful resources in this section.

Tying it all together 🎁

Finally, what does a good marriage between biology and AI look like? In two sections, we’ll share a toolbox, the AI-Bio Toolbox, and our most popular application, Cancer Analysis Approaches.

  • AI-Bio Toolbox: relevant and very powerful tools for applying AI towards biomedicine. The tools introduced here are biased towards deep learning approaches. You might see some of these packages applied in code walk-throughs throughout this notebook. (Note: This overview is definitely not comprehensive; a pretty good list of deep learning tools, which I’ll reference quite a lot, exists in this GitHub repository.)
  • Cancer Analysis Approaches: We summarize traditional ML for cancer and DL for cancer, and deep-dive into how to (1) generate and classify mutational signatures, and (2) use denoising autoencoders to extract features from breast cancer gene expression data.
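To give a flavor of the first deep-dive, here is a minimal sketch of the counting step that usually precedes mutational signature analysis: each single-base substitution is labeled by its trinucleotide context and reported from the pyrimidine strand, which gives the widely used 96-channel catalog. This is an illustration of that general convention, not necessarily the approach taken in the platform’s walk-through, and the function names are my own. The resulting counts are what a factorization step (NMF is a common choice) would then decompose into signatures.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs96_channel(context: str, alt: str) -> str:
    """Label one SNV, e.g. context 'ACG' with alt 'T' gives 'A[C>T]G'.

    `context` is the reference trinucleotide centred on the mutated base.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or alt == context[1]:
        raise ValueError("need a 3-base context and an alt that differs from ref")
    # Convention: report the mutation from the pyrimidine (C or T) strand.
    if context[1] in "AG":
        context, alt = reverse_complement(context), COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def mutation_catalog(snvs):
    """Count SNVs per channel; `snvs` is an iterable of (context, alt) pairs."""
    return Counter(sbs96_channel(context, alt) for context, alt in snvs)


if __name__ == "__main__":
    # Toy input for illustration only, not real patient data.
    toy_snvs = [("ACG", "T"), ("CGT", "A"), ("TGA", "A")]
    for channel, count in sorted(mutation_catalog(toy_snvs).items()):
        print(channel, count)
```

Note that a G>A change in the context CGT is the same event as C>T in ACG read from the opposite strand, so the first two toy mutations land in the same channel.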

Potential Next Steps 🐾

You’ll notice some empty sections. I haven’t had the opportunity to make this as comprehensive as I’d like, but this might be your opportunity to contribute to this open community, which is fascinated with the future of computation and the expandability of Biomedical-AI scientific collaboration.

Have fun exploring Research to the People and let me know if you have thoughts for improvement! ✌️

📢 P.S. SVAI is hosting its first hackathon for a patient with a heart of gold and an undiagnosed metabolic disease: Undiagnosed-1. Sign up here! ✔️
