The devil in the closet

April 2, 2019, Seattle, WA

Sometimes in science, a seemingly straightforward journey can take an enormous amount of time. Our paper in PLoS Biology (Hart et al., 2019) was one such journey. The question seemed easy enough: for a highly simplified microbial community, consisting of two yeast strains engineered to help each other or “cooperate”, could we predict how fast the community might grow?

If you think that this question is esoteric, it is not. Cooperation is surprisingly common in biology: pathogenic bacteria cooperate with each other to launch infections; microbes in sewage treatment sludge cooperate to break down waste. The faster a community can grow, the more likely it is to survive perturbations or advance to new territories. Ultimately, a quantitative understanding of microbial communities will empower us to control and use communities as, for example, probiotics.

One remarkable aspect of this work — the very long and turbulent gestation — is invisible from the data themselves. When responding to journal reviewers’ critiques, I had the urge to write down this untold story of scientific discovery.

A humble dream

The project started when I was a postdoc about 17 years ago. To tie biology to mathematics, I joined a physicist’s lab at the Rockefeller University in New York City. I wanted to see how interacting “parts” of a biological system might generate quantitative properties of the system as a “whole”.

All biological systems consist of parts. For example, an ecological community consists of interacting species, and the human body consists of different cell types. A quantitative understanding of how a biological system works can be very powerful. For example, it could help us predict what would happen if we were to perturb a part.

A mathematical model consists of one or more equations. An equation describes how different quantities are linked to each other. For example, how fast population size changes equals how fast new members are added through birth, minus how fast the existing members die. The growth and death rates are examples of model parameters.
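As a minimal sketch of the birth-minus-death equation above, here is a Python version. It assumes that each individual gives birth and dies at a constant rate, and the function names and numbers are purely illustrative rather than anything measured in the lab.

```python
import math


def population_size_exact(initial_size, birth_rate, death_rate, time):
    """Closed-form solution of dN/dt = (birth_rate - death_rate) * N.

    Assumes constant per-individual birth and death rates (an assumption
    made for illustration only).
    """
    return initial_size * math.exp((birth_rate - death_rate) * time)


def population_size_simulated(initial_size, birth_rate, death_rate, time, dt=0.001):
    """Step the same equation forward in small time increments (Euler method)."""
    size = initial_size
    for _ in range(int(round(time / dt))):
        births = birth_rate * size * dt   # new members added through birth
        deaths = death_rate * size * dt   # existing members that die
        size += births - deaths
    return size


if __name__ == "__main__":
    # Hypothetical numbers: 100 cells, birth rate 0.5 per hour, death rate 0.1 per hour.
    print(population_size_exact(100, 0.5, 0.1, 10))
    print(population_size_simulated(100, 0.5, 0.1, 10))
```

Here the birth rate and the death rate are the model parameters: change either one and the predicted population size changes with it.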

Back then, people had been modeling biological systems such as ecological communities, gene regulatory networks, and the cell division cycle. Some models matched data beautifully. However, the renowned mathematician and computer scientist John von Neumann once stated, “with four parameters I can fit an elephant, and with five I can make him wiggle his trunk.” In other words, given enough “free parameters” (parameters that one can choose freely, rather than parameters constrained by experimental measurements), a model can be made to fit any data. A model that fits the data can explain those data, but that does not mean the model is correct or can predict new data.

To avoid the “free parameter” problem, I decided to start with a very simple system. In such a system, I should know exactly how parts interact with each other. I could then write down the equations, know which parameters need to be measured, and measure all parameters. After much deliberation, I decided to engineer a highly simplified cooperative yeast community consisting of two strains, each supplying the other with an essential metabolite. My colleagues and I thought of a lovely name for it: CoSMO, for Cooperation that is Synthetic and Mutually Obligatory. Unlike real-life communities where scientists often have trouble counting the number of species, CoSMO has two and only two strains. Unlike real-life communities where species influence each other by releasing many uncharacterized chemicals, in CoSMO each strain releases only one metabolite, which is required and consumed by the partner. Moreover, in CoSMO the two strains coexist due to their inter-dependence, and thus I do not need to worry about losing either of them.

Now that I know how the two strains interact with each other, it should in principle be easy to predict community properties such as community growth rate, that is, how fast the community grows. Community growth rate primarily depends on two traits of each strain: the metabolite release rate and the amount of metabolite consumed per birth. Thus, modeling community growth rate boils down to measuring four parameters. This is by no means ambitious!
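To make the “four parameters” idea concrete, here is a deliberately oversimplified Python sketch. It is a toy, not the model in Hart et al. (2019). It assumes that every released metabolite is immediately consumed by the partner and turned into births, and it ignores cell death, metabolite build-up, and evolution. Under those assumptions, births of strain 1 per unit time equal (release rate of strain 2 × number of strain 2 cells) / (metabolite consumed per birth of strain 1), and likewise for strain 2. The long-term growth rate then works out to the square root of (release 1 × release 2) / (consumption 1 × consumption 2). All names and numbers below are hypothetical.

```python
import math


def toy_predicted_growth_rate(release_1, release_2, consumed_1, consumed_2):
    """Long-term growth rate of the toy model (an assumption-laden sketch).

    release_i  : metabolite released per cell of strain i per unit time
    consumed_i : metabolite consumed by strain i per birth
    """
    return math.sqrt((release_1 * release_2) / (consumed_1 * consumed_2))


def toy_simulated_growth_rate(release_1, release_2, consumed_1, consumed_2,
                              t_end=200.0, dt=0.01):
    """Simulate the toy community and measure its growth rate over the second half."""
    n1, n2 = 1.0, 1.0                      # arbitrary starting population sizes
    steps = int(round(t_end / dt))
    half = steps // 2
    total_at_half = None
    for step in range(steps):
        if step == half:
            total_at_half = n1 + n2
        # Each strain's births are paid for by the partner's released metabolite.
        births_1 = release_2 * n2 / consumed_1 * dt
        births_2 = release_1 * n1 / consumed_2 * dt
        n1 += births_1
        n2 += births_2
    elapsed = (steps - half) * dt
    return math.log((n1 + n2) / total_at_half) / elapsed


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary units.
    params = dict(release_1=0.1, release_2=0.4, consumed_1=2.0, consumed_2=1.0)
    print(toy_predicted_growth_rate(**params))
    print(toy_simulated_growth_rate(**params))
```

The point of the sketch is only that, once the interactions are known, a community-level quantity can be written in terms of measurable traits of each strain, with no free parameters left to tune.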

Mission aborted

I measured all four parameters. I measured the metabolite release and consumption traits of each strain in the absence of its partner. I got rid of the partner so that the released metabolites would accumulate in the test tube for me to measure, instead of being immediately consumed by the partner. However, since the partner was not present, the measurement environment (called the “batch culture” environment) differed from the community environment. For example, to measure metabolite consumption in batch cultures, I would add a high dose of metabolite at the beginning of an experiment. In contrast, in communities, the consumed metabolite is constantly supplied by the partner at a low level. Measuring strain traits in a community-like environment would require a special experimental setup. So, I had to hope that the batch culture environment could approximate the community environment.

After measuring the four parameters, I predicted community growth rate. However, my prediction was way off from the experimental results. I was disappointed. In theory, a mathematical model is useful when it fails, because failure suggests that we are still missing important pieces. In reality, I was far from thrilled by the failure, because too many pieces could be missing, even in a system as simple as CoSMO. For example, the batch culture environment might not approximate the community environment; cells could be evolving… The problem immediately became monstrously messy and inelegant: a devil.

Eventually, I aborted the mission. I was forced to ask whether I could explain some other community property: the minimal total cell density required for the community to start to grow. That calculation required measuring more parameters, such as cell growth rates at various metabolite concentrations. I did not have the experimental setup for such measurements, so I gave in to free parameters. I did what was, and still is, commonly done: I looked for literature values. The literature values varied over an order of magnitude, so I naturally chose the value that could explain my data. I felt guilty, but comforted myself by noting that at least the free parameter I chose was not outrageous, and that at least the fraction of free parameters in my model was far lower than in most other models. This reasoning did not exonerate me, but helped to bring closure to my postdoc project (Shou et al., 2007).

Haunted by the devil

When I started my lab at the Hutch, I promptly locked up the devil in a closet. I felt caught: on the one hand, grant reviewers kept punishing me because “CoSMO is too simple”, yet on the other hand, I could not even understand a very basic property of the community. My modeling failure was indeed humiliating. There was no way I could rephrase the question in an exciting fashion to attract anyone, possibly even including myself.

My group started working on other, sexier problems, such as how the two cooperating strains might fend off cheaters who consume but do not contribute metabolites (Waite and Shou, 2012; Momeni et al., 2013a).

Despite group members’ successes, the devil kept haunting me. Babak Momeni, then a postdoctoral fellow in my lab, examined spatial patterning in CoSMO when CoSMO grew on an agarose pad. When we compared patterns predicted by our model with patterns observed in experiments, they looked similar in a qualitative sense. However, the timing looked very different. This was not surprising given that we did not understand how fast the community grows. Fortunately, dynamics was not the focus of that paper, so we erased all time stamps from our simulations (Momeni et al., 2013b).

CoSMO patterning. The two cooperating strains were engineered to express green or red fluorescent proteins, and can thus be distinguished under a microscope. Time stamps are shown for the experiment (right) but not for the simulation (left).

Years later, Arne Traulsen at the Max Planck Institute for Evolutionary Biology in Germany would comment: “We all think that mathematical modeling of biology is hard, but here comes Wenying Shou and she shows us that she can do it.” And I would confess with embarrassment, “No, Arne, we have so far only managed the qualitative part. We do not even understand how fast CoSMO grows…”

Devil breaking loose

Eventually, the devil of my past failure would not allow me to ignore it any further.

Chi-Chun Chen and Jose Pineda, two talented group members, were quantifying metabolite release rates of evolved cells. They wanted to see whether cells could evolve to be more “generous” by releasing more. However, Chi-Chun and Jose were getting highly variable results despite their superb experimental skills. It seemed that we got stuck when the question turned quantitative.

We suspected that the variable measurement results could be due to cell traits being highly sensitive to the measurement environment. To enable measurements in a community-like environment, David Skelding, a physicist in the lab, started to build devices called “chemostats”. In chemostats, nutrients were supplied in small doses (small drops) but frequently (every tens of seconds), mimicking the partner strain’s slow but constant metabolite release. It took David a good year or more to ensure that chemostats worked reliably and precisely (Skelding et al., 2018).

Chemostats. This home-made multiplexed chemostat has eight culturing chambers (tubes with yellow stoppers). The syringe pump on the left pushes fresh medium into the chambers through tubing. Sterile humidified air is also introduced into the chambers to push out excess waste.

Taming the devil

When Sam Hart joined my lab as a research technician, he inherited the problems that Chi-Chun, Jose, and I had left behind. Sam had already done four years of undergraduate research at the University of Vermont, so he quickly picked up experimental skills from Jose. Sam was also an athlete, a member of the Seattle Sockeye Ultimate Frisbee Club. He had been trained to handle a lot of setbacks.

Initially, Sam continued to quantify strain traits in batch cultures, because after all, chemostat measurements are much harder and are limited by the number of chambers. However, at some point, we realized that without getting our fundamentals on a solid footing, we would be chasing our tails: if we did not understand the two ancestral strains (i.e. why the ancestral strains’ traits could not explain the ancestral community’s growth rate), there was no point trying to understand evolved strains.

By that time, Sam had already invested a year or two. But Sam was unflustered because he understood the importance of asking the right, albeit inconvenient, question. Sam re-measured the ancestral strains’ metabolite release and consumption traits in David’s chemostats. By controlling how slowly metabolites were supplied, Sam could force cells to grow at the various slow rates observed in CoSMO. However, chemostats introduced their own devil: cells from both strains quickly evolved away from their original states as they adapted to metabolite limitation. Sam then figured out ways to deal with this new problem.
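Why does the supply rate set the growth rate? A textbook chemostat model gives the intuition: once the culture settles down, cells must grow exactly as fast as they are washed out, so the growth rate equals the dilution rate chosen by the experimenter. The Python sketch below illustrates this under assumptions of my own choosing: a continuous flow instead of discrete drops, a Monod-type dependence of growth rate on metabolite concentration, no evolution, and made-up parameter values. It does not describe the actual devices in Skelding et al. (2018).

```python
def chemostat_steady_state(dilution_rate,
                           max_growth_rate=0.5,      # hypothetical, per hour
                           half_saturation=1.0,      # hypothetical metabolite units
                           inflow_metabolite=10.0,   # metabolite in the fresh medium
                           consumed_per_birth=1.0,   # metabolite used per new cell
                           t_end=400.0, dt=0.01):
    """Run a simple chemostat model until it settles; return (growth rate, cells, metabolite)."""
    cells = 0.1
    metabolite = inflow_metabolite

    def growth(level):
        # Monod form (an assumption): growth slows as the metabolite runs low.
        return max_growth_rate * level / (half_saturation + level)

    for _ in range(int(round(t_end / dt))):
        rate = growth(metabolite)
        # Cells are born at `rate` and washed out at `dilution_rate`.
        change_in_cells = (rate - dilution_rate) * cells
        # Metabolite flows in, flows out, and is consumed by newborn cells.
        change_in_metabolite = (dilution_rate * (inflow_metabolite - metabolite)
                                - consumed_per_birth * rate * cells)
        cells += change_in_cells * dt
        metabolite = max(metabolite + change_in_metabolite * dt, 0.0)

    return growth(metabolite), cells, metabolite


if __name__ == "__main__":
    for dilution in (0.05, 0.1, 0.2):
        rate, cells, metabolite = chemostat_steady_state(dilution)
        print(dilution, rate, cells, metabolite)
```

In this sketch, slowing the supply (a smaller dilution rate) forces the cells to settle at a slower growth rate and a lower metabolite concentration, which is the sense in which a chemostat can mimic a slowly releasing partner.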

Ancestral versus evolved clones. On agarose with low metabolite, ancestral cells failed to divide (arrows). Cells from a mildly-adapted evolved clone (center) showed mixed behavior: some cells remained undivided (arrow), while other cells formed microcolonies of various sizes. Cells from a strongly-adapted evolved clone formed microcolonies of a uniform and large size. These images were taken using a cell phone camera and thus do not have a scale bar. For reference, an average yeast cell (e.g. black dots in “anc”) has a diameter of ~5 µm.

Eventually, Sam discovered that measurements of metabolite release and consumption traits could indeed differ significantly in chemostats versus in batch cultures. Hanbing Mi, an undergraduate visiting student from China, figured out how to properly measure community growth rate when cells could evolve quickly. Once we took all of this into consideration, we solved the puzzle (Hart et al., 2019). But only partially: we still do not understand CoSMO’s initial phase of slower growth.

The model can explain experimental observations of CoSMO’s long-term growth rate. Model prediction explained experiments (purple) when parameters were measured in community-like chemostat environments (green), and not when parameters were measured in batch culture environments (blue). Error bars mark 95% confidence intervals.

Using the same quantification methodology that we have found to be trustworthy, Sam figured out what it means to be “generous” (Hart & Pineda et al., 2019, accepted), and which mutants evolved to be more generous (manuscript in preparation). Sam is now a graduate student at the University of Washington.

Summary

It takes a lot to do careful science. For science to advance, it must stand on a solid foundation. By demonstrating how to properly model a very simple living system, we have helped set the standard for future modeling of more complex systems such as probiotic communities or infectious diseases.

Acknowledgements

I am very grateful to my lab members, especially Sam Hart, Jose Pineda, Chi-Chun Chen, Hanbing Mi, and David Skelding for doing high-quality work. I thank Alex Yuan for introducing medium.com to me. Alex Yuan, David Skelding, Riley Kimsey, Robin Wilcox, George Moore, and Aziz Alfi provided feedback on writing. I also thank Yiting Lim (a writer for Hutch Science Spotlight) and Gary Gilliland (President of Hutch, an advocate of outreach) for providing additional impetus for this writing.

Notes

  1. Underlines contain clickable links to research articles.
  2. For my other stories, see https://medium.com/@wenying.shou.

Postscripts

  1. I read this story in Chinese to my mother, and she said, “I never knew what you were doing, but now I have some understanding!”
  2. I shared my story with Prof. Bruce Telzer, my former Cell Biology Professor at Pomona College. He wrote:

One aspect of the story struck me… how you view your initial experiments as “failures,” when from my perspective they were no such thing.

Thomas Edison once famously said, “I have not failed. I’ve found 10,000 ways that won’t work.” … I’m reminded of how much time I would spend in lab trying to get an experiment to “work,” all the while thinking that I was getting nowhere. However, during each apparent failure I successfully tweaked the pH, time, temperature, and reagent concentrations, etc., until finally on the 10,001st attempt, it worked. Edison was right.

One of the hardest tasks I faced as a professor, particularly in the intro bio courses, was dealing with the “my experiment didn’t work” complaint of students because they didn’t get their expected results. I would talk myself blue, trying to convince them that the experiment always “works” in terms of providing them meaningful data in order to plan the next experiment and then the next. Some of them eventually got that this is how science is done. Many, however, still left lab thinking they failed, and this always saddened me.