Origin of Life: Leading Theories & Developments
A summary of what I learned from the Santa Fe Institute’s Complexity Explorer online course
At some point in our lives, we’ve all asked, “What is life and how did it originate?” This question is a really difficult one. One might think that after decades of research we would have more answers than questions. But that is not the case today. We have never been more perplexed by the complexity and richness of living systems.
However, this is a great sign! Whenever we’ve answered some deep and fundamental question in science, we’ve unlocked a new larger set of questions. So we are heading in the right direction.
Science is often imagined as the layers of an onion: we start at the center and expand outward. Some of the smartest people on Earth have been making progress in unraveling the mystery around life. Some images and videos in the article should make it clear how far we’ve come.
I have a degree in chemical engineering, but I wish I had learned more about the earliest living factory on Earth, the cell. Now that I work more closely with the complexity sciences, I opted for the interdisciplinary course offered by the Santa Fe Institute called Origin of Life.
The course is a perfect blend of theory and experiments all backed by recent landmark publications.
To formalize the knowledge I gained, I decided to prepare a condensed summary of what I learned from this course.
This article is not peer reviewed by the course instructors (though I would like for it to be). It is possible that I may have made mistakes when recollecting content from memory. Wherever possible I have done my due diligence by referring back to the slides, but if there are still some mistakes then please forgive me. I’ll gladly fix them if you reach out to me by email: connect@rayyanzahid.com.
Furthermore, I haven’t referenced any of the content here. You can find all the references from the Origin of Life course on Complexity Explorer. More about it at the end of the article.
Contents
- The Tree of Life
- What do we know about LUCA?
- Early Investigations
- Location
- Chemical Reactions and Substrates
- Energy
- Complex Functions from Simple Origins
- Life from Simple Ingredients
- Evolution
- Complex Life
The Tree of Life
All life on Earth uses a finite set of molecules to perform a wide variety of cellular functions. We can generate phylogenetic trees by sequencing the genes found in different organisms and determine the relationships between them. It turns out that similar sets of organisms will often share the same sets of genes. We can thus trace the lineage of organisms and find common ancestors to see how they evolved.
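To make the idea of comparing sequenced genes concrete, here is a minimal Python sketch. The four sequences and organism names are made up for illustration (they are not real genes), and counting mismatched positions (the Hamming distance) is a deliberately crude stand-in for the alignment and tree-building methods that real phylogenetics uses.

```python
from itertools import combinations

# Hypothetical, made-up gene fragments of equal length (not real data).
sequences = {
    "organism_A": "ATGCCGTAAGCT",
    "organism_B": "ATGCCGTTAGCT",
    "organism_C": "ATGACGTTAGGT",
    "organism_D": "TTGACCTTCGGT",
}


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_pair(seqs):
    """Return the pair of names whose sequences differ the least."""
    return min(
        combinations(sorted(seqs), 2),
        key=lambda pair: hamming(seqs[pair[0]], seqs[pair[1]]),
    )


for first, second in combinations(sorted(sequences), 2):
    print(first, second, hamming(sequences[first], sequences[second]))
print("closest relatives:", closest_pair(sequences))
```

The pair with the fewest differences would be grouped together first. Repeating that kind of grouping is the intuition behind distance-based tree building.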
Scientists have spent countless hours sequencing genes of different organisms. This is what we’ve learned from it:
All life on Earth is separated into three main branches:
- The Archaea (prokaryotes)
- The Bacteria (prokaryotes)
- The Eukaryotes (e.g. animal, plant and fungal cells).
And they all stem from the same root.
The phylogenetic tree of life shows that the eukaryotic cells evolved only once. On closer look, some early cells might have engulfed a prokaryotic cell that established an endosymbiotic relationship with its host. This created the first eukaryotic cell. Mitochondria and chloroplasts are examples of this assimilation of the prokaryotic cell. These new cells are distinctively capable of multicellularity.
But where did these three branches evolve from? It is inferred from the tree that at some point in our history the branches diverged from a single common ancestor. This ancestor is known as the Last Universal Common Ancestor (LUCA), the population of cells from which all life stemmed. This singularity is the limit on how far into the past we can infer data about life on Earth.
What do we know about LUCA?
It would’ve been distinctively a homeostatic cell. It would’ve had some set of metabolic processes that maintained the internal cell environment. These processes would have facilitated cell growth, reproduction and some mechanism of passing on hereditary instructions. Without these processes, the cell would’ve been unable to reproduce successfully and would’ve gone extinct.
LUCA would also have needed specialized molecules to regulate its internal machinery. Genes, transfer RNAs, ribosomes for translation, a bilayer membrane and membrane channels would all have been essential. In addition, energy molecules such as chiral D-sugars and substrates such as chiral L-amino acids would have been needed for the proper functioning of the cell.
Knowing all this still doesn’t solve the mystery of how life originated. LUCA is still too complex to originate instantaneously. However, it does make it more manageable to study the origin of life. We now have a single endpoint for abiogenesis.
Here is the real million dollar question. How did our LUCA emerge from the available materials on early Earth?
Early Investigations
Knowing how complicated LUCA is, the origin of life needs a more advanced pre-biotic explanation.
The Miller-Urey experiment (1952) was designed to look for that pre-biotic explanation of life. The experiment took simple compounds thought to be abundant on early Earth and cycled them through a set of energetic processes. After some time, these seemingly simple ingredients reacted to produce compounds that we know are part of LUCA’s biochemistry.
Eureka?! Case closed?
Not quite. In the same way you cannot create chickens from chicken soup, you cannot create life by just having some complex molecules in a jar. It doesn’t help either that the molecules in the experiment often end up reacting further to form a black tar soup.
So a much more sophisticated theory is needed. Understanding abiogenesis requires:
- A thorough analysis of chemical reactions
- An understanding of the constraints on the chemical processes observed in nature.
Deeper studies work with the constraints of location, energy sources and reactants in Earth’s timeline. There are salient dynamics in Earth sciences, physics and biology that allow us to understand how natural processes can accomplish the sophistication needed in abiotic processes to create life.
Location
The history of Earth can be traced as far back as ~4.5 billion years. The Hadean age was when Earth was still a ball of magma. Around ~4.1 billion years ago, as soon as it got cool enough, Earth started forming a habitable floating crust.
The Nice model and retrieved lunar rock samples have revealed some important facts about Earth’s turbulent early history. Asteroids, including carbonaceous chondrites carrying organic amines and water, regularly bombarded Earth for a sustained period of its early history. Some of the compounds found in these asteroids form only in the environment of space. The Late Heavy Bombardment (LHB) of asteroids added to the chemical diversity of our planet. New research has revealed that the LHB lasted longer than previously thought.
The climate eventually cooled down and Earth entered into the Archean age, becoming a much more hospitable jar for an Earth sized Miller-Urey experiment. Our Sun was still ~30% dimmer and Earth sat right on the inner edge of the classical habitable zone. The amount of energy reaching Earth would have been ample for surface reactions.
Studying the habitable zones of other planetary systems also allows scientists to find Earth-like planets and so work with a wider range of planetary conditions. Compared to our solar system, some stars even have a larger, more promising habitable zone.
We’ve found life in just about every environment on Earth. Extremophiles can withstand negative pH, extreme salinities and high temperatures. Not even the cold, dry, irradiated vacuum of space has been too much for some life on Earth, such as tardigrades. Cuatro Ciénegas in Mexico is a great place to study the past. It still has biomes preserved from 3 billion years ago.
With these discoveries, we’re more optimistic that life could have evolved very early in Earth’s history and possibly elsewhere in the Universe as well.
Life also doesn’t have to look like the one we have today. Genetic modification and other lab based experiments have shown that life-like features can exist even without membranes and cells. Some experiments have used non-polar and even non-liquid solvents.
Understanding the conditions on other planets in our Universe also helps us understand our own. Scientists use direct imaging and indirect methods (such as the Doppler wobble effect) to find exoplanets, and study their atmospheres using secondary eclipses and phase curves.
All in all, the location matters a lot. Earth has undergone a lot of changes in the past few billion years. It is very difficult to study the conditions of the planet billions of years in the past. But space sciences and Earth sciences have both helped paint a better picture of the geochemical and geophysical conditions of our planet.
Given that scientists now know more about the environment, they can study chemical reactions with better precision.
Chemical Reactions and Substrates
A reaction is thermodynamically favored when its change in Gibbs free energy is negative. The free energy change depends on the enthalpy difference between reactants and products and on the change in entropy caused by the reaction. The activation energy required to start the reaction determines its speed. In the presence of a catalyst, the activation energy is lowered, which increases the rate.
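The two ideas in this paragraph can be written down in a few lines. The sketch below uses the standard relations ΔG = ΔH − TΔS (a reaction is favored when ΔG is negative) and the Arrhenius equation k = A·exp(−Ea/RT). All the numbers are assumed, illustrative values, not measurements of any real reaction.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K)


def gibbs_free_energy(delta_h, temperature, delta_s):
    """Return dG = dH - T*dS in J/mol (dH in J/mol, T in K, dS in J/(mol*K))."""
    return delta_h - temperature * delta_s


def arrhenius_rate(prefactor, activation_energy, temperature):
    """Return the rate constant k = A * exp(-Ea / (R*T)), with Ea in J/mol."""
    return prefactor * math.exp(-activation_energy / (GAS_CONSTANT * temperature))


# Assumed example: an exothermic reaction that also lowers entropy.
dg_cool = gibbs_free_energy(-50_000, 298, -100)  # negative, so favored
dg_hot = gibbs_free_energy(-50_000, 600, -100)   # positive, so not favored

# Assumed example: a catalyst lowering the barrier from 80 to 60 kJ/mol.
uncatalyzed = arrhenius_rate(1.0, 80_000, 298)
catalyzed = arrhenius_rate(1.0, 60_000, 298)
speedup = catalyzed / uncatalyzed

print(dg_cool, dg_hot, speedup)
```

Note that the catalyst changes only the rate. It leaves ΔG, and therefore whether the reaction is favored at all, untouched.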
It is critical that the first protocells that led up to LUCA had reactions that produced stable compounds, more specifically thermodynamically stable polymers with a kinetically stable tertiary structure. It is important for any protocell to have stable molecules that persist over time.
For example, prion diseases are difficult to cure because the misfolded proteins they produce are thermodynamically and kinetically stable, which makes them difficult to remove or destroy.
Scientists have found ways to detect evidence of the ancient reactions necessary to sustain life in fossilized rocks. Carbon and oxygen are essential ingredients of organic life, and their stable isotopes (13C and 18O) do not naturally decay, so the ratios in which they appear, reported as δ13C and δ18O, are preserved over geological time. Because of the kinetic isotope effect and the equilibrium isotope effect, reactions involving these isotopes leave distinct signatures in fossilized rocks.
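The δ notation used for these isotope signatures is a simple ratio comparison: the heavy-to-light isotope ratio of a sample is compared with that of a reference standard and reported in parts per thousand (per mil). Here is a small sketch of that arithmetic. Both ratios below are assumed, rounded values chosen for illustration, not measured data or an official standard value.

```python
def delta_per_mil(ratio_sample, ratio_standard):
    """Return the delta value in per mil: (R_sample / R_standard - 1) * 1000."""
    return (ratio_sample / ratio_standard - 1.0) * 1000.0


# Assumed, illustrative 13C/12C ratios (not measured data).
STANDARD_RATIO = 0.01120
sample_ratio = 0.01092

delta_13c = delta_per_mil(sample_ratio, STANDARD_RATIO)
print(f"delta 13C = {delta_13c:.1f} per mil")
```

A negative value means the sample is depleted in the heavier isotope relative to the standard. Biological carbon fixation tends to leave this kind of depletion, which is why such values are read as possible traces of ancient life.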
Living organisms make use of whatever resources are available on the surface of the Earth. Carbon, Hydrogen, Nitrogen, Oxygen, Phosphorus and Sulphur all form organic compounds. It is no coincidence that compared to other elements, life uses elements that are relatively more abundant in the Universe and on Earth. However, in the past, the abundance of these elements would have varied.
Oxygen, for example, was a scarce resource in the first 2 billion years of Earth’s history. Oxygen was used faster than it was being produced. When photoautotrophs emerged, oxygen became more abundant. The availability of excess oxygen helped change the surface chemistry to what we see on Earth today.
Thus, prebiotic life would have emerged in an oxygen deficient environment where it used other more abundant gases like methane.
Elements and compounds with properties favorable to life are as follows:
Phosphates: A relatively scarce resource, phosphates are essential in providing structural support to the nucleic acids, DNA and RNA. They are also essential for the stable phospholipids that form the protective membrane around the cell. Due to the negative charge of the membrane, water and other cells can be repelled, which creates compartmentalization within the cells. The charge also acts as a natural signal for communication. Phosphates, due to their high bond energy, store plenty of chemical energy and help lower the activation energy of other reactions.
Carbon: A unique element that can form four stable bonds with a wide variety of elements. Carbon compounds can take on three-dimensional shapes, which makes carbon a very versatile building block. Silicon is abundant like carbon and forms four stable bonds as well, but it creates difficult-to-handle products in reactions with certain elements such as oxygen. Boron, metal oxides, ammonia, urea and formamide also don’t compare well with carbon, although these compounds do have specific uses in cells today.
Surface evolvers offer a simpler explanation of how life arose on early Earth. In the absence of oxygen, methanogenesis would have supplied enough energy for the protocells to function. Unoxidized flat iron surfaces under the sea would have acted as sites where reactions could have been sustained. And with plenty of volcanic activity under the sea, the essential elements of life would have all been present in one place. Coupled with the physical dynamics of reactions, simple living systems may have been possible in these conditions.
Water: A polar molecule that lends kinetic stability to the tertiary structure of macromolecules. Proteins such as enzymes need a specific structure to perform their function. Water’s polarity enables it to form hydrogen bonds with other molecules. The structure of DNA is highly dependent on this type of bond.
Hydrophobicity is also a consequence of this polarity, and it allows phospholipids to form bilayer or micelle membranes. The relative abundance of H₂ and the stability of H₂O make water a very important component of early life.
Sugars: Made up of carbon, hydrogen and oxygen, sugars store energy and act as building blocks for the cell. They can form through the autocatalytic formose reaction, which builds sugars from formaldehyde. Because the reaction catalyzes itself, it can produce a whole family of carbon compounds essential for life. Even with tautomerization, the 5-carbon sugar, the simplest cyclic chain of carbon, can provide all the essential reactants for life to occur.
Lipids: These form fats. Phospholipids, glycolipids and archaeal ether lipids are made up of glycerol, sugar and fatty acid tails. The electrostatic charge of the head and the tail of a lipid molecule determines the kind of membrane that is formed. A membrane allows cells to create an independent environment where autocatalytic reactions like the formose reaction can take place. Early cells could have used single-tailed lipids, but these would generally have been less stable.
The Central Dogma of Biology and the reactants:
The Central Dogma is a very simple concept. DNA is the blueprint of life, and it is replicated by DNA polymerase to ensure its persistence. A piece of this blueprint is transcribed into messenger RNA, which carries it to the ribosome, where transfer RNAs help translate the message into actual proteins.
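The flow of information from DNA to messenger RNA to protein can be mimicked with plain string operations. The sketch below is a toy, not a model of the real machinery: the DNA fragment is made up, and the codon table is a small excerpt of the standard genetic code containing only the codons this example needs.

```python
# Small excerpt of the standard genetic code (only what this example needs).
CODON_TABLE = {
    "AUG": "Met",   # also the usual start codon
    "UUU": "Phe",
    "GGC": "Gly",
    "GCU": "Ala",
    "UAA": "STOP",
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (thymine becomes uracil)."""
    return coding_strand.replace("T", "U")


def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon not in excerpt
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGTTTGGCGCTTAA"  # made-up fragment for illustration
mrna = transcribe(dna)
print(mrna, translate(mrna))
```

In a real cell, each of these one-line steps is carried out by large molecular machines (RNA polymerase, the ribosome and transfer RNAs), which is part of why LUCA is considered too complex to have appeared all at once.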
DNA, RNA and nucleic acids: Deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) are both nucleic acids. Their nucleobases are either purines (two rings) or pyrimidines (one ring). The nucleobases allow hydrogen bonds to form between complementary bases. Two or three hydrogen bonds determine the pairing between nucleotides. This is known as Watson-Crick base pairing. A nucleotide is made up of a phosphate, a sugar (ribose in RNA, deoxyribose in DNA) and a base. Polymerases help catalyze the reactions that link nucleotides together.
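The pairing rules can be captured in two small lookup tables: adenine pairs with thymine through two hydrogen bonds, and guanine pairs with cytosine through three. This sketch builds the complementary strand of a made-up DNA fragment and counts the hydrogen bonds holding the two strands together.

```python
# Watson-Crick partners in DNA and the hydrogen bonds each pair forms.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand):
    """Return the base-by-base complementary strand."""
    return "".join(PARTNER[base] for base in strand)


def reverse_complement(strand):
    """Return the complement read in the opposite direction."""
    return complement(strand)[::-1]


def hydrogen_bond_count(strand):
    """Count hydrogen bonds between a strand and its complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


fragment = "ATGC"  # made-up fragment for illustration
print(complement(fragment), reverse_complement(fragment), hydrogen_bond_count(fragment))
```

Sequences rich in G and C are held together by more hydrogen bonds per base pair than sequences rich in A and T.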
Proteins: Proteins are made from charged or uncharged amino acids that condense and polymerize into long chains with specific structures. These structures help lower transition state energy and orient and concentrate reactants.
Energy
Storms, plate tectonics, radiation, thermal vents, and ocean currents are all viable sources of energy for early life. The energy would have been used to provide the activation energy required to power the first reactions.
From our phylogeny, we understand that cells use chemical or light energy to function. Since there was little O2 early in Earth’s history, chemoautotrophs probably used sulfur and hydrogen. Because carbon was abundant, early autotrophs would have been able to reduce CO2 or use methanogenesis to fix carbon while gaining energy. Modern redox reactions are more complicated. Primordial life in the absence of oxygen could have used iron as a catalyst.
Energy within the cellular environment requires a balance of different conditions. This balance is maintained by the flow of energy.
Energy can be harvested on the surface of membranes through redox reactions. Redox reactions can be coupled to supply the activation energy required by another reaction. This is achieved through charge separation, which allows chemiosmosis to take place. The proton photo-electron redox loop and the proton pumps are used in tandem by the cell to make cellular reactions possible.
Complex Functions from Simple Origins
If you look inside living cells, they form very definite and sometimes complex macro- and microstructures. There are explanations for these phenomena. Pattern formation can be found and visualized in seemingly simple chemical systems and their physical dynamics. The reaction-diffusion equation is one such example.
A simple system such as the Brusselator has very complicated dynamics. These systems are very sensitive to change: one can often get very different results by marginally varying the conditions. The limit cycles seen in cellular function are part of these nonlinear dynamics.
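The Brusselator is small enough to simulate in a few lines. In its usual well-mixed form it has two concentrations, x and y, that follow dx/dt = a + x²y − (b + 1)x and dy/dt = bx − x²y. The sketch below integrates these equations with a plain explicit Euler step, chosen here for brevity rather than accuracy. The parameter values and starting point are assumptions picked to show the two regimes: a steady state when b < 1 + a², and sustained oscillations (a limit cycle) when b > 1 + a².

```python
def simulate_brusselator(a, b, x0=1.2, y0=1.2, dt=0.001, steps=100_000):
    """Integrate the Brusselator with explicit Euler; return x and y histories."""
    x, y = x0, y0
    xs, ys = [], []
    for _ in range(steps):
        dx = a + x * x * y - (b + 1.0) * x
        dy = b * x - x * x * y
        x += dt * dx
        y += dt * dy
        xs.append(x)
        ys.append(y)
    return xs, ys


def late_swing(values):
    """Difference between max and min over the second half of a run."""
    tail = values[len(values) // 2:]
    return max(tail) - min(tail)


steady_x, steady_y = simulate_brusselator(a=1.0, b=1.5)  # b < 1 + a^2
cycling_x, cycling_y = simulate_brusselator(a=1.0, b=3.0)  # b > 1 + a^2

print("steady run ends near:", steady_x[-1], steady_y[-1])
print("late swing in x:", late_swing(steady_x), late_swing(cycling_x))
```

Changing a single parameter flips the same two equations from settling down at (a, b/a) to oscillating indefinitely, which is the kind of sensitivity to conditions described above.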
Physical dynamics limit the reactants’ functionality. The macro-behaviour of the cell is affected by the metabolic rates and the diffusion of reactants.
In the end, for any life to originate, seemingly complex reactants need to form from very limited and often simpler substrates. Autocatalytic sets, in which molecules act as mutual catalysts, give us a way to understand how we can get molecules that perform complex tasks. Autocatalytic sets produce a series of compounds that perpetuate the initial reaction. The formose reaction, which we previously mentioned for sugars, is one such example. The formation of RNA in the cell is speculated to be another reaction of this kind.
At the macromolecular level, these reaction systems also set the limits on the size and structure of any system of molecules.
There is a challenge in predicting how these reactions must have taken place in the past. The Lorenz system is a chaotic system: it is deterministic but gives wildly different results with a slight variation in the initial conditions. In the absence of perfect knowledge about the initial conditions, the final state of the system is almost unknowable to us.
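This sensitivity is easy to see numerically. The sketch below integrates the Lorenz equations with the classic parameter values (σ = 10, ρ = 28, β = 8/3) using a hand-written fourth-order Runge-Kutta step, and follows two trajectories whose starting points differ by one part in a hundred million. The starting points, step size and run length are assumptions made for this illustration.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))

    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def max_separation(start_a, start_b, dt=0.01, steps=4000):
    """Largest distance reached between two trajectories during the run."""
    a, b = start_a, start_b
    largest = 0.0
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        distance = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
        largest = max(largest, distance)
    return largest


gap = max_separation((1.0, 1.0, 1.0), (1.0 + 1e-8, 1.0, 1.0))
print("largest gap between near-identical starts:", gap)
```

Identical starting points give identical trajectories, because the system is fully deterministic. A difference far too small to measure, however, grows until the two runs have nothing to do with each other.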
In vitro reactions help us get a better grip on the initial conditions of the reaction. DNA is an incredibly sophisticated and complicated molecule. There is a popular theory called The RNA World that tries to explain how life may have sustained itself using a simpler genetic system. RNA, as we’ve mentioned, can emerge from autocatalytic sets. This means that RNA can replicate without the sophisticated enzymes usually found in DNA replication. RNA is more susceptible to degradation and denaturation but is still a more likely candidate for early life. RNA is studied through in vitro evolution.
“The RNA World” theory is a promising one because the autocatalytic origin of RNA seems to be very viable. DNA, on the other hand, requires complicated machinery to keep the mutation rate at an astounding 10^-10 per base. DNA has 4 simple bases. With modifications like methylation, 44 modified bases in DNA and 112 in RNA can exist. These modified bases are sometimes functional and sometimes mutational. Enzyme activity, damage and misincorporation generate these errors.
Ribosomal RNA contains within it a historical record of early life. The molecule is built in layers that, when peeled back, reveal a more limited and primitive version of the ribosome. This gives a stronger reason to believe ribosomes are prebiotic.
As for DNA, how does it maintain a mutation rate of only 1 in 10 billion, which is orders of magnitude better than any human-created process? Life today uses base excision repair, direct reversal of damage, DNA mismatch repair and other built-in mechanisms to fix the DNA. Some of these mechanisms are integrated into DNA polymerase and kinetic proofreading.
Life from Simple Ingredients
After a thorough investigation of the theories, we are more confident in the types of molecules and functions that LUCA would have had. Genes, RNA, membranes and energy substrates would have been present. A simple metabolism with the ability to fix carbon (in the absence of oxygen) would have been present in early life.
Earlier versions of life, such as the protocells, would have used chemical gradients, electron transfer and catalytic sets to build their metabolism.
Spiegelman’s experiment was an important one that showed how biological systems evolve to accumulate features and traits that are scarce in the environment. If functional substrates are abundant outside a cell, the cell will lose the ability to produce those substrates.
Why do we always find RNA and DNA in the cell? The simple answer it seems is that the environment does not have all the mechanisms to help the cell reproduce.
Evolution
The development of protocells into LUCA and further into the life we know today can be understood in terms of metabolism, genetic transmission and evolution.
Darwin was the first person to deduce a pattern in living creatures. Today the most popular recipe for adaptation is replication, then mutation and finally selection. Surface evolvers are a good place to start, as they may not have had cells or genes and could thus be the base case.
During replication, changes such as the insertion, deletion, inversion, recombination and migration of genes are subject to drift and selection. Drift is when specific gene sequences go extinct by chance, and selection is when a sequence dominates over a fitness landscape and thus persists. It is to be noted that no causal relationship between fitness and complexity has yet been established.
For the complete organism, variation, inheritance and selection affect its existence.
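The replication, mutation and selection recipe can be sketched as a short loop. Everything in this toy is an assumption made for illustration: genomes are bit strings, fitness is simply the number of 1s, each parent leaves one imperfect copy per generation, and the fitter half of parents plus offspring survives. It is not a model of any real organism.

```python
import random


def fitness(genome):
    """Toy fitness: the number of 1s in the bit string (an assumption)."""
    return sum(genome)


def mutate(genome, rate, rng):
    """Copy a genome, flipping each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]


def evolve(pop_size=20, genome_length=30, rate=0.02, generations=40, seed=1):
    """Run the loop and return the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [
        [rng.randint(0, 1) for _ in range(genome_length)]
        for _ in range(pop_size)
    ]
    best_per_generation = [max(fitness(g) for g in population)]
    for _ in range(generations):
        # Replication with mutation: every parent leaves one imperfect copy.
        offspring = [mutate(parent, rate, rng) for parent in population]
        # Selection: parents and offspring compete, the fitter half survives.
        ranked = sorted(population + offspring, key=fitness, reverse=True)
        population = ranked[:pop_size]
        best_per_generation.append(fitness(population[0]))
    return best_per_generation


history = evolve()
print("best fitness, first and last generation:", history[0], history[-1])
```

Because parents compete alongside their offspring, the best fitness in this toy can never drop from one generation to the next. Real populations have no such guarantee, which is where drift comes in.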
This Darwinian evolution can be modeled using either the quasispecies model of self-replicating entities at a high mutation rate or the Price equation. Evolutionary benefits are charted over a fitness landscape. When analyzing steady state solutions, there is an error threshold. Once the mutation rate crosses this threshold, the species undergoes error catastrophe. The species can then either lose the ability to replicate or lose its identity entirely.
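A common simplified reading of the error threshold goes like this: a "master" sequence of length L that replicates σ times faster than its mutants, and copies each base correctly with probability 1 − μ, can only hold its ground if σ·(1 − μ)^L > 1. The sketch below turns that condition into code. The superiority value σ = 10 and the per-base error rate of one in a thousand are assumptions chosen for illustration, and this single inequality is a simplification of the full quasispecies model.

```python
import math


def master_persists(superiority, error_rate, length):
    """True if superiority * (1 - error_rate)**length exceeds 1."""
    return superiority * (1.0 - error_rate) ** length > 1.0


def max_genome_length(superiority, error_rate):
    """Longest genome that still satisfies the persistence condition."""
    return math.log(superiority) / -math.log1p(-error_rate)


# Assumed values: a sloppy replicator with one error per 1,000 bases copied.
limit_sloppy = max_genome_length(10.0, 1e-3)
# The per-base error rate of 1e-10 quoted for modern DNA replication.
limit_accurate = max_genome_length(10.0, 1e-10)

print(round(limit_sloppy), f"{limit_accurate:.2e}")
```

Under these assumptions the sloppy replicator is limited to a genome of a couple of thousand bases, while the accurate one could in principle maintain billions. This is one way to see why long genomes and low error rates go together.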
Selection does not mean that a trait will persist. As in the Gambler’s Ruin, under moderate selection pressure even the fitter gene can go extinct because of random events.
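The Gambler’s Ruin comparison can be made quantitative with the Moran process, a standard textbook model that is used here as a stand-in and is not necessarily the one the course had in mind. A single mutant with relative fitness r in a population of N individuals eventually takes over with probability (1 − 1/r) / (1 − 1/r^N), and is lost otherwise. The fitness advantage and population size below are assumed values.

```python
def fixation_probability(relative_fitness, population_size):
    """Chance that one mutant takes over a Moran-process population."""
    r, n = relative_fitness, population_size
    if r == 1.0:
        return 1.0 / n  # neutral mutant: pure drift
    return (1.0 - 1.0 / r) / (1.0 - 1.0 / r ** n)


neutral = fixation_probability(1.0, 100)
fitter = fixation_probability(1.1, 100)    # assumed 10% advantage
weaker = fixation_probability(0.9, 100)    # assumed 10% disadvantage

print(f"neutral: {neutral:.4f}, fitter: {fitter:.4f}, weaker: {weaker:.2e}")
print(f"chance the fitter mutant is lost anyway: {1 - fitter:.2f}")
```

Even with a 10% advantage, the new mutant is lost far more often than it succeeds, simply because it starts out as a single copy.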
Complex Life
Thanks in part to evolution, we now have multiple branches of complex life originating from LUCA. Eukaryotic life evolved only once to form the Last Eukaryotic Common Ancestor (LECA). This cell was most likely a predator or a symbiotic host. There are models that explain how this happened, such as the inside-out and outside-in models.
Multicellular organisms have distinct advantages in their surroundings. Physically, cells can create niches with regard to their spatial environment and resource availability. A pioneer species in an environment has two routes. It can succeed as a generalist in its fundamental niche or create its own realized niche for which it will be highly specialized.
Even though life can create its own niche, there are limits to cell physiology no matter how favorable an environment is. Kleiber’s law relates an organism’s metabolic rate to its body mass: across species, metabolic rate scales roughly with mass to the 3/4 power. Metabolic rate, body temperature and resting heart rate are highly correlated and follow similar patterns across species. The lower and upper bounds of cell growth are also limited by the energy usage and volume of the cell and by the diffusion equation.
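Kleiber’s law is usually stated as a power law: metabolic rate is proportional to body mass raised to the 3/4 power. The sketch below shows what that exponent implies. The normalization constant is set to 1 in arbitrary units, an assumption made so that only ratios between organisms matter.

```python
def metabolic_rate(mass_kg, normalization=1.0, exponent=0.75):
    """Whole-body metabolic rate under Kleiber's law, in arbitrary units."""
    return normalization * mass_kg ** exponent


def rate_per_kg(mass_kg):
    """Metabolic rate per unit of body mass."""
    return metabolic_rate(mass_kg) / mass_kg


small, large = 1.0, 16.0  # assumed body masses in kg, chosen for round numbers
whole_body_ratio = metabolic_rate(large) / metabolic_rate(small)
per_kg_ratio = rate_per_kg(large) / rate_per_kg(small)

print(whole_body_ratio, per_kg_ratio)
```

An organism 16 times heavier needs only about 8 times the energy, so each kilogram of the larger organism runs at about half the rate.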
There have been many attempts to understand how life propagates in populations. A cellular automaton is one such system: given some basic rules, it can deterministically and autonomously create complex patterns. The Von Neumann self-reproducing automaton is an attempt to design the architecture of a self-reproducing machine. However, biology is different from physics. Whereas there is fixed universality in physics, complex adaptive systems such as life create laws that change depending on the state of the system.
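A one-dimensional "elementary" cellular automaton is about the smallest example of simple rules producing structure. Each cell is 0 or 1, and its next value depends only on itself and its two neighbors, with the eight possible outcomes packed into a rule number from 0 to 255. The sketch below is a minimal implementation; the choice of rule 90, the grid width and the wrap-around edges are assumptions for this illustration, and it is far simpler than Von Neumann’s self-reproducing design.

```python
def step(cells, rule):
    """Apply an elementary cellular automaton rule (0-255) once, with wrap-around."""
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        neighborhood = left * 4 + center * 2 + right  # a value from 0 to 7
        new_cells.append((rule >> neighborhood) & 1)
    return new_cells


def run(cells, rule, generations):
    """Return the starting row followed by each successive generation."""
    history = [cells]
    for _ in range(generations):
        cells = step(cells, rule)
        history.append(cells)
    return history


start = [0] * 15 + [1] + [0] * 15  # a single live cell in the middle
for row in run(start, rule=90, generations=15):
    print("".join("#" if cell else "." for cell in row))
```

Running this prints a nested triangular pattern that grows from a single cell, even though the rule itself only ever looks at three cells at a time.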
The naturalists’ argument is that life can be understood by understanding the physics and chemistry of the universe. The functionalists argue that, in order to give a complete definition, it is more important to understand the properties exhibited under these physical laws.
There is one bifurcation in how physical systems and biological systems evolve. A physical system’s arrow of time is determined by the increasing entropy of the system caused by natural degradation. However, the arrow of time for living systems is determined by adaptation, the reverse of the thermodynamic arrow of time. It is argued that for living systems, the focus should be on the individual agent, or entity.
With his “Fundamental Theorem of Natural Selection”, Ronald Fisher used the “Euclidean Gradient Vector Field” to show how agents adaptively move towards optimality. It can be shown mathematically that all adaptive processes minimize an agent’s uncertainty about its world, or equivalently maximize its information about the world. Evolution, Bayesian inference and reinforcement learning are all examples of agents maximizing information about the world.
Recalling Spiegelman’s experiment, the virus pruned its genome when it found greater certainty of its internal substrates in the external environment.
This has created a third school of thought on the origin of life. The first is genetics-first and the second is metabolism-first. The third, alternative school of thought explains life in terms of information processing systems. An example of this is the Landauer bound of thermodynamic efficiency calculated for DNA. Reproduction is computationally only about 20 times less efficient than this theoretical limit. A computer, on the other hand, is about 100 million times less efficient.
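For a sense of scale, the Landauer bound says that erasing one bit of information costs at least k·T·ln 2 of energy, where k is the Boltzmann constant and T is the temperature. The sketch below evaluates that limit and then applies the two multipliers quoted above (20 and 100 million). The temperature of 300 K is an assumption, and the per-bit framing is a simplification of how such comparisons are actually made.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def landauer_limit(temperature_kelvin):
    """Minimum energy in joules to erase one bit: k * T * ln(2)."""
    return BOLTZMANN * temperature_kelvin * math.log(2)


limit = landauer_limit(300.0)        # assumed temperature of about 27 °C
reproduction_cost = 20 * limit       # multiplier quoted in the text
computer_cost = 100_000_000 * limit  # multiplier quoted in the text

print(f"Landauer limit: {limit:.3e} J per bit")
print(f"reproduction:   {reproduction_cost:.3e} J per bit")
print(f"computer:       {computer_cost:.3e} J per bit")
```

On these figures, the gap between biology and our machines is several million-fold per bit processed.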
If life is an attempt at reducing uncertainty about its surroundings, the definition of what life really is may well be more encompassing than we imagine.
As such, there may be separate and entirely new laws to explain these self-evolving systems. We see from the scaling laws how a population’s energy use can scale linearly for all species.
At first glance humans appear to be an exception to some of these laws. But the truth is that technology has had an incredible impact on the human ecology.
Is technology a living system? Where does it fit in the story of the origin of life?
Scientists correlate different measurable variables in living things to further create classifications. Energy is used for development, growth, maintenance and reproduction. Once obtained, it is processed, distributed and converted at an optimal and efficient rate at the right temperature. Energy is often correlated with heart rates, temperature and even population density to study niches.
Understanding the origin of life on Earth can be challenging. The field of non-equilibrium physics is essential to understand how changes in state occur over time. Phase transitions show how systems can sharply display different properties after small changes of a variable within a narrow range.
The origin of life is still largely an open question. However, recent progress calls for a lot of optimism. The pursuit to answer this one question is making humans realize that there is more to living systems than what we previously thought. Information theory, complex adaptive systems and computational thinking are changing the ways we fundamentally think about evolutionary development. There is certainly a lot more to discover.
Complexity Explorer is a web-based platform that delivers online courses, tutorials, and resources essential to the study of complex systems. Complexity Explorer is an education project of the Santa Fe Institute.
The course Origin of Life is available from the link below:
All references for the content in this article can be found in the Origin of Life course.
About the writer
Rayyan is software engineer working for a startup in San Francisco. His work revolves around complex adaptive systems, systems thinking and software languages. You can mail him your thoughts at connect@rayyanzahid.com.
Rayyan has a free weekly newsletter called Elevate. Be exposed to 25+ diverse topics that he exposes himself to every week. Each piece of the newsletter is written with some background into the topic.