Biology is riddled with little questions that deserve a quick, data-driven answer.

Jake Feala
Apr 23, 2017 · 5 min read

In the spirit of creating catchy, one-sentence product descriptions, I’m writing this post to highlight the need for

a place where scientists can submit a question and get back an expert data analysis.

Read on if this interests you!

An unmet need for data-driven answers

In my experience, biologists often come across data-related questions that could impact their research, but they don’t know where to get answers. Worse, they don’t know what they don’t know.

For example, what are the applicable data sets, methods, and published answers related to their question? How can they use the massive repositories of online, publicly available data to answer their specific question? What are all of the important caveats and gotchas unique to each dataset?

Often, an experienced bioinformatics scientist can help to find the relevant data and address the question with data analysis and visualization, but access to this expertise is usually limited to either

  1. tapping on the shoulders of your in-house bioinformatics group (who are already swamped and might have a narrow range of expertise), or
  2. launching a large, expensive project with an external consulting group.

There is a large “activation energy” to both of these options. As a result, important but “small” questions are set aside and research efforts forge ahead, even though a crucial bit of knowledge could have steered the research in a dramatically more fruitful direction.

It’s very hard to know precisely how often this happens, or how much money and effort is wasted due to missing information that could readily have been provided by an expert in bioinformatics. However, the massive investment in data science across all industries and the current shortage of bioinformatics talent within biopharma together suggest that the problem is very real and that executives are aware of it.

I believe that, within the R&D groups of life sciences companies of all sizes and in industries ranging from agriculture to biotech, there lies a massive hidden demand for data-driven answers. Specifically, my hypothesis is that:

There is an untapped market of unanswered, data-related questions in the day-to-day practice of science in the biotech industry, which could be answered by bioinformatics experts external to the organization.

Experts on demand, for questions of all sizes

Consulting in professional services has traditionally been reserved for large projects, with budgets of at least tens of thousands of dollars, initiated from the “top-down” by upper management. This makes plenty of sense for the many problems that are well known and that require extensive onsite interviews to assess, weeks of dedicated effort by a large team to implement, and deep integration of solutions with internal systems and processes.

However, much of the value of having bioinformatics as part of your research team simply lies in the ability to approach complex biological questions with a data-driven mindset, as well as the tool sets possessed by experts in software and data science. There is no correlation between the size and difficulty of a problem and whether an external expert can provide a valuable answer. In my career embedded with scientific teams, I’ve encountered many “one-off” questions that I’ve been able to answer with public data, a simple scatterplot, and a bit of prose. Though the time and effort to address these questions was small, the answers have often had major impacts on the thinking and future directions of the research group.

Therefore, I believe there is a market for a new, third solution in addition to the two above: microconsulting. In a microconsulting framework, any scientist in the organization can submit short, directed questions to a centralized platform and receive a rapid, data-rich response from a pool of carefully vetted external experts. The resulting data could be easily shared across the organization, if desired. In contrast with traditional consulting, the impetus for requesting outside expertise would come from the “bottom-up,” in other words, from the scientists working directly on the research problems.

What would a microconsulting solution look like?

I can envision a platform in which certain questions would be submitted online, much as they would be typed into Google or posted to StackOverflow (a common Q/A forum for software developers). However, the questions could be kept private at the level of the user or institution when sensitive information is involved. A prioritized list of questions would be kept in a ticketing system or queue, and a team of experts would collaborate to provide answers in order of priority. Importantly, and unlike StackOverflow, Quora, or other existing forums, the answers would only be provided by a preselected pool of experts, rather than the general public.
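To make the idea a little more concrete, here is a minimal Python sketch of the queueing behavior described above: questions carry a priority and a privacy flag, and only preselected experts can pull the next one. Every name in it (`Question`, `QuestionQueue`, `next_for`, `visible_to`, the example expert and institution names) is a hypothetical illustration, not a description of any existing platform.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Question:
    priority: int                          # lower number = answered sooner
    seq: int                               # tie-breaker: order of submission
    text: str = field(compare=False)
    institution: str = field(compare=False)
    private: bool = field(compare=False, default=True)


class QuestionQueue:
    """Hypothetical prioritized queue served by a vetted pool of experts."""

    def __init__(self, vetted_experts):
        self._heap = []
        self._counter = itertools.count()
        self._experts = set(vetted_experts)

    def submit(self, text, institution, priority, private=True):
        """Add a question; it stays private to its institution by default."""
        question = Question(priority, next(self._counter), text,
                            institution, private)
        heapq.heappush(self._heap, question)
        return question

    def next_for(self, expert):
        """Hand the highest-priority open question to a vetted expert."""
        if expert not in self._experts:
            raise PermissionError(f"{expert} is not in the expert pool")
        if not self._heap:
            return None
        return heapq.heappop(self._heap)

    def visible_to(self, institution):
        """Open questions a given institution may see, in priority order."""
        return sorted(q for q in self._heap
                      if not q.private or q.institution == institution)


# Example usage with made-up names.
queue = QuestionQueue(vetted_experts=["expert_a"])
queue.submit("What is the complete list of human E3 ligases?",
             institution="acme_bio", priority=2)
queue.submit("Does p53 mutation knock down its expression?",
             institution="other_co", priority=1)
first = queue.next_for("expert_a")   # the priority-1 question comes out first
```

A real system would also need authentication, audit trails, and a way to attach data sets, all of which this sketch leaves out.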

Ideal questions for this framework would have the following properties:

  • Self-contained within a couple of paragraphs
  • Rely only upon readily available public data, or a provided data set
  • Can be answered in a few hours with some data curation, analysis, and/or visualization, yielding a few figures and some short prose that tells a story with the data

The following are some real-life examples, drawing from my own areas of expertise:

  • Do silent mutations in p53 cause p53 alternative splicing?
  • What is the complete list of human E3 ligases?
  • What DNA sequence motifs are over-represented in the attached list of transcription factor binding sites?
  • How much oxygen is consumed and ATP produced from the complete oxidation of 1 glucose and 1 glutamate?
  • What percent of patients have both germline BRCA1/2 mutations and copy number loss of PARP1?
  • Is KRAS expressed when it carries the p.G13D mutation?
  • Which nuclear genes correlate in expression level with mitochondrial genes across normal tissues?
  • In internal RNA-seq dataset 123, available through the attached link, we treated X cells with Y conditions. A new publication, pdf attached, suggests pathway Z might be involved. Do genes in Z change with treatment Y?
  • Does p53 mutation knock down its expression?
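To give a sense of how small the computation can be once the data has been curated, here is a hedged Python sketch of the co-occurrence question above (germline BRCA1/2 mutation together with PARP1 copy number loss). The patient IDs and set memberships are made-up placeholders, not real data, and the function name `percent_with_both` is my own; in practice the real effort goes into finding the right cohort and handling each dataset’s caveats.

```python
def percent_with_both(cohort, has_a, has_b):
    """Percent of the cohort that appears in both annotation sets."""
    cohort = set(cohort)
    if not cohort:
        raise ValueError("cohort is empty")
    both = cohort & set(has_a) & set(has_b)
    return 100.0 * len(both) / len(cohort)


# Placeholder cohort of eight patients (IDs are invented for illustration).
cohort = [f"P{i}" for i in range(1, 9)]
germline_brca = {"P1", "P3", "P6"}   # assumed: germline BRCA1/2 carriers
parp1_loss = {"P3", "P4"}            # assumed: PARP1 copy number loss

pct = percent_with_both(cohort, germline_brca, parp1_loss)
print(f"{pct:.1f}% of patients have both alterations")  # prints 12.5% here
```

The answer delivered to the scientist would pair a number like this with a figure and a short note on cohort selection and data quality.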

This style of question may not be unique to the life sciences, and if so, the platform could be extended to other data-intensive industries.

A happy home for the experts

Decoupling the source of questions (biologists) from the source of answers (bioinformatics) provides other benefits as well.

Bioinformatics scientists are a unique breed, and often have vastly different needs and core values from the rest of a scientific organization. Due to the nature of their work, they probably

  • prefer, and are more productive, working remotely
  • require deeper control of, and more resources from, their IT environment (e.g., cloud access), as well as their own laptops
  • enjoy working with other computational and data experts, but have only a handful of like-minded colleagues within their organization
  • yearn for modern communication tools that fit their working style, but are stuck with IT choices geared toward biologists and chemists

By remaining independent from any individual biopharma company, it would be possible to tailor a working environment specifically for bioinformatics experts, in which remote work is encouraged, modern tools are provided, and an internal culture of software and data geeks can thrive. This would have massive potential for recruiting the top expertise from a heavily recruited talent pool.

Announcing a microconsulting pilot program

I’m excited to announce that my consulting company, Outlier Bio, is now launching a free pilot of its microconsulting platform, based on the tenets above, for a select list of customers. We will be expanding this pilot group slowly, as we grow our capacity, but please contact us if you or your company is interested in participating.

To sign up for our free microconsulting pilot program, please send a request to answers@outlierbio.com

Outlier Bio blog

Thoughts about the field of bioinformatics, and how to make it better

Written by Jake Feala

Full-stack genomics data engineer. Independent consultant. Entrepreneur in a love-hate relationship with the field of bioinformatics.
