Foresite Labs: Supporting the Next Generation of Healthcare Entrepreneurs

Dr. Vikram (Vik) Bajaj
Foresite Labs Notebook
Oct 31, 2019


We are announcing the creation of Foresite Labs, a center for entrepreneurial innovation at the nexus of data science and healthcare. Foresite Labs, based in San Francisco and Boston, will partner broadly to launch life sciences and healthcare companies that use the tools of data science to solve our greatest unmet medical needs.

Introduction: data science and public health

In the 19th century, infectious disease ravaged the poorest quarters of the industrialized European capitals. London was the site of one particularly lethal outbreak of cholera in 1854, notable not only for its cost in lives but also because the episode marked the birth of our profession; John Snow, a physician, would become its father. Snow sought to do much more for his patients than was possible according to the medicine of his time, which held that cholera was caused by noxious airs (“miasma” theory). Through meticulous research, Snow was able to plot the natural history of the epidemic on a map of London, revealing that it was centered on a water pump on Broad Street. By removing the handle of that now-notorious pump, he halted the outbreak. Though it took decades for his results to motivate changes in urban sanitation, Snow showed that we can prevent disease without even understanding its biological basis.

John Snow’s Cholera Map, 1854.

Snow still speaks volumes in the era of big data. In modern language, he combined multidimensional data sets to develop a hypothesis about the cause of his patients’ illness, which then led him to devise a way to help that he could quickly test. This simple cycle of “observation-intervention-outcome” is still how we generate the most reliable scientific evidence. We believe that the tools of data science can greatly accelerate this cycle.

(top) Healthcare today: interventions are based on targeted observations, tested in carefully controlled trials, then deployed bluntly in a broad population. Each intervention is reimbursed. (bottom) Healthcare tomorrow: deeper or better observations result in continuous refinement of interventions, which are only reimbursed if they work.

How evidence will transform healthcare

Unfortunately, healthcare in today’s America has more in common with Snow’s London of 1854 than we normally admit. We spend an increasing fraction of our GDP on healthcare with returns that are diminishing in many populations. Due to wealth disparities and lack of access to quality medical care, we experience public health emergencies like opioid addiction, mental illness in all its forms, obesity, and the rise of adolescent smoking. These are no less lethal to us than cholera was for Londoners of the Victorian era. Our related scientific knowledge is vast and growing, and yet only a fraction of our clinical practices are grounded in reliable science. In fact, our most elegant science is often focused on the well-defined diseases of the few rather than the needs of the many. In spite of the enormous economic incentives to solve these problems, we’ve broken our covenant with the parts of society that need us most.

The ingredients of healthcare transformation.

Yet, Snow’s example also makes us optimistic about a potential solution. What if we could systematically accelerate the generation of reliable medical evidence? This is the promise of industrialized measurement, understanding and experiment — collectively the field of “data science” — that has transformed other industries. I know you’ve heard this before. At a company-wide meeting, I once explained to an audience of Google’s software and hardware engineers, product managers, and executives our reasons for starting a healthcare project: as scientists, even the questions we ask are circumscribed by the tools we have available to answer them. That’s why the best scientists are tool-builders. We came to Google because the best scientific tools of our generation were ones that the technology industry had built for a completely different purpose.

The last few years have established encouraging proofs of this reasoning. Our understanding of disease is being transformed by studies (UK Biobank, deCODE) that use our genetic diversity, the grand experiments of nature, to teach us about human biology. Through powerful new tools like single-cell sequencing, we have resolved the basis of disease in its fundamental cellular units. Next, the tools of functional genomics allow us to deterministically reprogram the genetic code of a model organism and confirm our hypotheses about what is causing a disease and how to repair it, connecting the laboratory to the clinic in ways that were unimaginable a decade ago. Similar efforts with diverse clinical data sets have given rise to the concept of “precision medicine,” in which we seek to understand an individual’s risk for a disease and respond to it. All these efforts depend on sophisticated computational tools ranging from instantaneously available distributed computing to artificial intelligence methods that yield superhuman insights from experimental data.

The increasing connectedness of life sciences and healthcare should be no less transformational than the information revolution was in many other industries. Without ever leaving her desk, a scientist today can already access data sets whose scale and scope eclipse the totality of biological information available a decade ago, resulting in an explosion of research output. We see examples within the mainstream R&D portfolios of the major pharmaceutical companies of programs constituted to leverage these possibilities. We also note, in sophisticated provider networks and systems, the development of precision interventions that seek to prevent or delay disease in high-risk individuals. While encouraging, these examples have not been transformational for patients. Unlike in other industries whose productivity has been altered merely by reducing the barriers to the flow of information, the transformation of healthcare will be more tortuous.

Delivering on the promise of data science

We believe that successful companies at the nexus of data science and healthcare will do everything possible to accelerate the generation of reliable evidence; less successful ones will address only part of this learning cycle. Ultimately, the products of these companies will be judged by the outcomes they achieve for patients and for society, and successful companies will have business models that reward them only for achieving the outcomes that matter.

We’ve found that three factors, more than any others, differentiate these companies:

  1. Having the right data. Companies generate valid hypotheses from data sets that are large and free from bias; no computational method can generate statistical power where none exists. As a result, few problems are solved by existing data alone, and most innovative companies will generate new data in the laboratory or the clinic. For example, while there is a great deal to learn about disease biology from population-scale genetic databases, they often do not contain enough patients suffering from a particular disease and are rarely balanced in terms of sex and ethnicity. Often, the needed data sets are so large that no single company can generate them alone, necessitating collaboration.
  2. Accelerating knowledge generation through experiments. Hypotheses that arise from observational data must be confirmed through experiment. The best companies not only plan these experimental programs in advance but also deploy the tools of data science to systematically accelerate them. For example, companies using functional genomics approaches to validate drug targets will use modern methods, including machine learning, to design efficient experiments and read out their results. The laboratory of the future will involve precisely engineered model systems, ubiquitous automation or microfabrication, and machine learning-driven design of high-content experiments; its scientists will work more behind the screen than behind the bench. Similarly, clinical product development will be transformed once we know more about target patient populations, the natural histories of their diseases, and what actually matters to them. In the limit, pragmatic studies conducted in real-world settings can turn every patient interaction into a tool that generates evidence. All such companies will have excellent software engineering practices that support pervasive experiments conducted reproducibly and rigorously at scale.
  3. Culture and people. The transformation of discovery and product development in the life sciences requires professionals with dramatically different preparations and expectations to share a common purpose. Rarely are such bridges possible to build in an existing organization. Successful companies have a culture of humility that facilitates collaboration within the organization and outside it, and generally it arises from a continual focus on what’s best for the patient. Often these companies won’t be located in Silicon Valley, but instead closer to the patients they serve.
A partial list of exemplary companies employing data science methods in the life sciences and healthcare. Each meets our three criteria for successful companies in this space.
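
The first factor above, that no computational method can generate statistical power where none exists, can be made concrete with a rough calculation. The sketch below is only an illustration under stated assumptions: it uses the textbook normal approximation for a two-sided two-proportion test, the function name `power_two_proportions` is hypothetical, and the carrier frequencies and cohort sizes are invented for the example rather than taken from any real database.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_cases, p_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test.

    Uses the standard normal approximation and ignores the (tiny)
    probability of rejecting in the direction opposite to the true effect.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)

    # Standard error under the null hypothesis (pooled proportion).
    p_bar = (p_cases * n_cases + p_controls * n_controls) / (n_cases + n_controls)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n_cases + 1 / n_controls))

    # Standard error under the alternative (each group keeps its own proportion).
    se_alt = sqrt(
        p_cases * (1 - p_cases) / n_cases
        + p_controls * (1 - p_controls) / n_controls
    )

    delta = abs(p_cases - p_controls)
    return z.cdf((delta - z_crit * se_null) / se_alt)


# Hypothetical scenario: a variant carried by 5% of controls and 6% of cases,
# in a database with 100,000 controls but a varying number of cases.
for n_cases in (100, 1_000, 10_000):
    power = power_two_proportions(0.06, 0.05, n_cases, 100_000)
    print(f"{n_cases:>6} cases -> approximate power {power:.2f}")
```

Under these assumed numbers, a very large database with only a hundred affected patients has little chance of detecting the effect, whatever analysis is applied afterwards; what changes the picture is adding cases, which is why new data generation and collaboration matter.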

Foresite Labs: Accelerating Healthcare Innovation

We are on the cusp of a transformation in healthcare. At the end of that transformation will be a modern healthcare system that efficiently serves the interests of the healthcare consumer, and critical to its realization is the systematic acceleration of the “learning loops” that generate evidence. What can we do to accelerate this transformation?

We believe that there is nothing wrong with healthcare that can’t be fixed by those already working in healthcare. There are a thousand epidemics and a thousand John Snows waiting to cure them. We don’t overestimate our influence and we recognize that all we can do is make it just a little easier for these visionary entrepreneurs to accomplish great things.

To that end, we are announcing the formation of Foresite Labs, an entity whose only objective is to support visionary entrepreneurs accelerating healthcare innovation. Our goal is to dramatically reduce the barriers for these entrepreneurs to test their ideas. We also want to create an ecosystem that will allow Foresite Labs companies and strategic partners to collaboratively undertake big projects that they would never complete alone. We will do that by giving each company some of the factors of success that I just outlined.

Foresite Labs Capabilities

Data and Experimental Capabilities: The Labs platform is engaged both in the amalgamation of existing data in a unified and harmonized environment and in the prospective generation of genetic, molecular, and clinical data from cohorts of people and from laboratory experiments (e.g., functional genomics and high-content imaging). These data-generating activities are concentrated in areas of interest for company incubation, but they are always formulated to solve more general problems, so that the resulting capabilities can be reused.

Experiment Frameworks and Translational Research: Our group has produced scalable and secure cloud implementations of a broad suite of tools for primary data processing, genetic association analysis, causal inference, and machine learning, together with rigorous experiment frameworks for the derivation of phenotypes and the design of experiments. Our translational research focuses on pushing the practical limits of causal inference and machine learning approaches applied to biological and clinical data sets. We intend to publish many of these tools and make them available over time as open source projects. This platform also extends to the design, analysis, and control of high-throughput biology and chemical biology in the wet laboratory.

People and Culture: We have assembled an experienced and talented team of employees and advisors, and they stand ready to partner with new companies in the journey to independence. Our team, with offices in Boston and San Francisco and operations in New York, includes visionary statistical geneticists, clinicians and clinical scientists, data scientists/ML engineers, software engineers, laboratory scientists, and experienced operations, legal, and business staff. They have tremendous flexibility to work on individual projects or contribute to many. They all share an ambition to transform healthcare for the benefit of our patients and a commitment to the highest scientific and engineering standards.

Senior management of Foresite Labs

  • Dr. Alex Blocker, Head of Data Science (formerly of GRAIL and Verily),
  • Dr. Rick Dewey, Head of Genomics Discovery (formerly of the Regeneron Genetics Center),
  • Dr. Damien Soghoian, Head of Operations and Strategy (formerly of Verily), and
  • Dr. Paul da Silva Jardine, Head of Drug Discovery (formerly of Pfizer).

Foresite Labs Scientific Advisory Board

  • Mathai Mammen, Global Head of R&D, Janssen Pharmaceuticals, J&J
  • Paola Arlotta, Chair, Harvard Department of Stem Cell and Regenerative Biology
  • Euan Ashley, Director, Center for Inherited Cardiovascular Disease and Clinical Genomics Program; Co-Director, Stanford Data Science Initiative
  • Calum MacRae, Vice Chair, Scientific Innovation at the Department of Medicine at Brigham and Women’s Hospital
  • Steve Finkbeiner, Director, Taube/Koret Center for Neurodegenerative Disease Research at Gladstone; Investigator, Roddenberry Stem Cell Center
  • Alex Aravanis, Chief Scientific Officer, Head of R&D, and Co-Founder at GRAIL
  • Ruslan Medzhitov, Sterling Professor of Immunobiology; Investigator, Howard Hughes Medical Institute
  • Jennifer Listgarten, Professor of Electrical Engineering and Computer Science at University of California, Berkeley; Chan Zuckerberg Investigator
  • Jeff Huber, Vice Chair, GRAIL (founding CEO, GRAIL; former SVP, Google, former board member, Illumina)

Open for business: partnership, collaboration and hiring

We are actively developing several categories of companies right now. Examples include therapeutics companies that are grounded in primary human data (genetics and other -omics) and that employ novel laboratory data generation capabilities to precisely understand subsets of complex disease. We are also interested in building infrastructure that would systematically accelerate the clinical development phases for such companies, particularly those engaged in pragmatic trials incorporating real-world evidence. Finally, we are exploring how increasingly accurate measurements of individual risk, through genetic and other instruments, will impact care delivery.

Foresite Labs is also hiring and seeking new strategic partnerships and collaborations. In particular, we are hiring aggressively in Boston, San Francisco, and New York, and will soon be opening offices for Labs or its companies in New York, Montreal, and Philadelphia. We are searching for data scientists, chemists, laboratory scientists, statistical geneticists, clinical scientists, biostatisticians, and software engineers who share our commitment.

Foresite Labs,



Dr. Vikram (Vik) Bajaj
Managing Director, Foresite Capital Management; Co-Founder and Managing Director, Foresite Labs; Professor (adjunct), Stanford School of Medicine.