Protein folding is one of the oldest and most extensively studied problems in biophysics. It aims to understand the process by which a disordered, extended chain of protein building blocks (amino acids) rearranges into a stable, compact, and characteristic three-dimensional structure.
The so-called Levinthal paradox implies that the folding process cannot be a random search. If all intermediate folding conformations were equally probable, protein folding would take longer than the estimated age of the universe, even for a small protein. However, experimentally, we know that protein folding occurs on the order of seconds or faster. So, what’s going on?
“We feel that protein folding is speeded and guided by the rapid formation of local interactions which then determine the further folding of the peptide.” — Cyrus Levinthal, How to fold graciously
The local interactions Levinthal refers to involve the interaction of the protein with itself, as well as the protein with the surrounding water. Temperature causes the disordered protein to wiggle around. Quickly, a favorable local interaction is found. This introduces stability, constraining the conformational search space, driving the protein closer to the folded state. This process repeats until the protein is completely folded.
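To get a feel for the size of the random search described above, here is a back-of-the-envelope sketch. The chain length, the number of conformations per residue, and the sampling rate are all illustrative assumptions, not values from our study.

```python
# Back-of-the-envelope Levinthal estimate. Every number here is an
# illustrative assumption, not a measured quantity.
N_RESIDUES = 100              # assumed chain length
STATES_PER_RESIDUE = 3        # assumed conformations available to each residue
SAMPLING_RATE = 1e13          # assumed conformations tried per second
AGE_OF_UNIVERSE_S = 4.35e17   # roughly 13.8 billion years, in seconds


def exhaustive_search_time(n_residues, states_per_residue, rate):
    """Seconds needed to visit every conformation exactly once."""
    return states_per_residue ** n_residues / rate


search_time = exhaustive_search_time(N_RESIDUES, STATES_PER_RESIDUE, SAMPLING_RATE)
print(f"random search: {search_time:.1e} s")
print(f"in units of the age of the universe: {search_time / AGE_OF_UNIVERSE_S:.1e}")
```

Because the number of conformations grows exponentially with chain length, changing the assumed numbers shifts the answer by many orders of magnitude, but the qualitative conclusion for protein-sized chains stays the same.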
This topic is often framed in terms of “states” and “energy”. A “state” is simply a label for a set of similar protein conformations. “Energy” refers to the relative stability of a given “state”. Using these concepts, protein folding can be represented by the schematic below. The folded state is given by F, and the unfolded state is given by U. The red detail of the U basin represents unfolded intermediate conformations. Often there is more detail to the U state as the extended protein forms various metastable local arrangements. This is in contrast to the F state, which is typically very well defined. Note that the structure of this landscape will change if the temperature is changed. When we study protein folding, we aim to understand the structure of the U state, as well as the most probable path from U to F.
Fig 1. Schematic of protein folding landscape. The folded, “F”, state is low in energy and narrow, while the unfolded, “U”, state is higher in energy and has metastable structure (red). We aim to understand the conformational rearrangements and most likely pathway as the protein moves from U to F.
To study the folding process, it would be convenient to have a microscope capable of watching the protein as it folds. Unfortunately, that is not possible with modern technology. Instead, we can turn to computer simulations. The method of molecular dynamics (MD) affords spatial and temporal resolution fine enough to examine the folding process in detail. Once the simulation has finished, the analysis framework of Markov models allows us to examine the MD simulations in terms of states and the transition probabilities between these states. Below we report our application of these techniques to study the folding mechanism of the CLN025 miniprotein. In this work, we have detailed the structure of the unfolded state and identified the conformational pathway and the bottleneck (also referred to as the “rate-limiting” step) characterizing the transition from unfolded to folded. Through this effort, we hope to have helped “fill in” the unresolved portions of the experimental folding mechanism.
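As a minimal sketch of the Markov model idea described above, the snippet below counts transitions in a trajectory that has already been assigned to discrete states and turns the counts into transition probabilities. The toy trajectory, the two-state labeling, and the function names are hypothetical, and this simple count-based estimator is not the pipeline used in our paper.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts at a lag time given in frames."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # States that were never left keep an all-zero row instead of dividing by zero.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def stationary_distribution(T):
    """Left eigenvector of T for the largest eigenvalue, normalized to sum to 1."""
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmax(evals.real)
    pi = np.abs(evecs[:, idx].real)
    return pi / pi.sum()


# Hypothetical toy trajectory: 0 = unfolded, 1 = folded.
dtraj = np.array([0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0])
T = estimate_transition_matrix(dtraj, n_states=2)
pi = stationary_distribution(T)
print("transition matrix:\n", T)
print("stationary populations:", pi)
```

Each row of the matrix gives the probabilities of where the protein goes next from a given state, and the stationary distribution gives the long-time population of each state.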
About the CLN025 miniprotein
CLN025 is a 10-residue miniprotein, designed to fold quickly from an extended structure into an extremely stable beta-hairpin at room temperature. Beta-hairpins are a common protein secondary structure motif. Studying beta-hairpin formation in CLN025 can therefore lend insight into the folding of more complex systems. Below we depict this system in both its unfolded (extended) state and its folded state. The arrows in the folded state indicate a formed beta-hairpin.
Fig 2. CLN025 in unfolded (top) and folded (bottom, two views) conformations. Amino acid building blocks are labeled along the structures, and hydrogen bonds are shown in purple. We see that the unfolded state is disordered, while the folded state is very compact and stabilized by several hydrogen bonds.
Stability refers to the energy, or probability, of a particular state. The lower the energy, the higher the stability and the higher the probability of finding the protein in that state. The probability of the F vs. U state varies as a function of temperature, and can be visualized via a “melting curve”. The melting curve for CLN025 shows that at room temperature, CLN025 will be folded with nearly 100% probability. As temperature increases, the probability of the unfolded state increases. When the temperature is 340 K (about 150 degrees Fahrenheit), there is equal probability of being in either the F or U state. This temperature is referred to as the melting temperature. At this temperature, the protein will stochastically fluctuate between folded and unfolded. Our study was performed at the melting temperature so that we could observe many instances of folding and unfolding. We were able to analyze many different folding instances and therefore gather robust statistics regarding the atomic interactions and rearrangements involved in folding.
Fig 3. The melting curve of CLN025, by Davis et al., illustrates how the population of the folded state varies as a function of temperature. Highlighted in red is the melting temperature, 340 K. The melting temperature is the temperature at which the folded and unfolded states are equally probable (and also the temperature at which this study was performed).
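The shape of a melting curve can be sketched with a simple two-state model, in which the folded population follows from the free energy difference between U and F. This is a textbook approximation rather than the fit used by Davis et al.: the unfolding enthalpy below is an arbitrary placeholder, and heat capacity changes are ignored. Only the 340 K melting temperature comes from the text above.

```python
import math

R = 8.314462618e-3   # gas constant in kJ/(mol K)
T_MELT = 340.0       # melting temperature in K (from the text)
DELTA_H = 50.0       # unfolding enthalpy in kJ/mol: arbitrary placeholder, not fit to data


def unfolding_free_energy(temperature, delta_h=DELTA_H, t_melt=T_MELT):
    """G_U - G_F in kJ/mol for a two-state model without heat capacity change."""
    return delta_h * (1.0 - temperature / t_melt)


def folded_fraction(temperature, delta_h=DELTA_H, t_melt=T_MELT):
    """Boltzmann population of the folded state at the given temperature in K."""
    delta_g = unfolding_free_energy(temperature, delta_h, t_melt)
    return 1.0 / (1.0 + math.exp(-delta_g / (R * temperature)))


def kelvin_to_fahrenheit(temperature):
    return (temperature - 273.15) * 9.0 / 5.0 + 32.0


for temp in (300.0, 320.0, 340.0, 360.0, 380.0):
    print(f"{temp:5.0f} K  folded fraction = {folded_fraction(temp):.2f}")
print(f"340 K is {kelvin_to_fahrenheit(340.0):.0f} degrees Fahrenheit")
```

In this model the folded fraction is exactly one half at the melting temperature by construction. How steeply it rises toward low temperature depends on the placeholder enthalpy.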
Historical developments in the theory of beta-hairpin formation
The characteristic three-dimensional structure of CLN025, or the “folded state,” consists of a hairpin-type shape which has a turn in the middle and a strand of a structure called a “beta sheet” on either side. This type of structure is therefore called a “beta-hairpin.” CLN025 is very small, but beta-hairpins are found in many larger and more complicated systems, so it is useful to understand how they are formed. In the late 1980s, a simple model was proposed for how a beta-hairpin forms from an extended structure: first, the turn forms in the middle of the protein. Second, starting from the turn, the local contacts between the two strands are formed like a zipper. However, a contradictory model was quickly proposed: in the competing model, the extended structure first finds stable local interactions and becomes more compact. Then, from this compact state, the turn forms. There was also disagreement about which step of the process was the slowest, or rate-determining, step. Research groups using MD methods like ours, as well as groups using experimental approaches, disagreed about what forms first (the turn or the compact structure) and, separately, about which step is the slowest, so there were many conflicting theories about how beta-hairpins form.
Fig 4. Diagram of two contradictory models of beta-hairpin formation. On the left side, the turn forms first from the extended structure and then the protein folds. On the right side, the extended structure collapses and then the turn is formed within the collapsed state. The relative speed of each step is also debated.
In 2012, an important paper was published that experimentally measured the folding of CLN025. To do this, specific positions on the protein were carefully chosen for their involvement in the turning and collapsing processes. For each experiment, one position was monitored and the temperature was rapidly increased, which caused the protein to become extended again. Then, the amount of time it took for that specific position to regain its favorable state (“relax”) was measured. Below a certain temperature, both positions relaxed in the same amount of time. This means that collapse and turn formation occur simultaneously, and folding can be described using a simple two-state system. This is as if the schematic in Fig 1 had very little detail in the U region of the landscape (the red detailed portion would be very smooth). However, above that temperature, the positions behaved differently. This experiment showed that the slowest step is forming the turn, and also suggested that collapse happens first. Our study took place at a temperature higher than the red line. This suggests that we should witness collapse before turn formation, meaning that we should find structure in the unfolded state due to local interactions.
Fig 5. Summary inspired by the 2012 study of Davis et al. Two positions are monitored to investigate how fast they relax after a temperature jump. Below a certain temperature, one process is observed for both positions. Above that temperature, the processes diverge. These processes correspond to the turn formation and collapse of CLN025.
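The relaxation measurement described above comes down to extracting a time constant from a decaying signal. The sketch below shows one simple way to do that, assuming the signal relaxes as a single exponential toward a known baseline. The synthetic data, time units, and function name are hypothetical, and this is not the analysis performed by Davis et al.

```python
import numpy as np


def relaxation_time(t, signal, baseline):
    """Fit signal - baseline = A * exp(-t / tau) as a straight line in log space."""
    log_deviation = np.log(np.abs(signal - baseline))
    slope, _intercept = np.polyfit(t, log_deviation, 1)
    return -1.0 / slope


# Synthetic, noise-free example with a made-up time constant (arbitrary time units).
t = np.linspace(0.0, 500.0, 51)
tau_true = 100.0
signal = 1.0 + 0.5 * np.exp(-t / tau_true)

tau_fit = relaxation_time(t, signal, baseline=1.0)
print(f"fitted relaxation time: {tau_fit:.1f}")
```

If two monitored positions give the same fitted time constant, a single process describes both. Diverging time constants point to separate processes, as in the high-temperature regime of Fig 5.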
In our new paper, we simulate and statistically summarize the primary folding pathway of CLN025 at the melting temperature. Our study matches the experimental measurements, demonstrating that the collapse process happens before the turn formation, and that the turn formation is the slowest step. We also find that at this temperature, the folded and unfolded states are equally probable. Additionally, our statistical summary enriches this mechanism with dynamical detail through the construction of a precise folding landscape (movie, right), which reveals the structure of the unfolded state. We can then sample this landscape to illustrate the major and intermediate atomic arrangements involved in folding.
We can visualize a summary of our results using a representative trajectory. We can track and summarize the dynamics of this trajectory using the computed energy landscape (movie, right). This is shown below. Note that hydrogen bonds are shown in purple. The protein was simulated in water, but the water molecules have been removed so that the protein conformation is easier to see. We see there is a folded state (F), an extended unfolded state (extended, or U_E), as well as a collapsed unfolded state (collapsed, or U_C). The collapsed state displays a topology similar to the folded state, but is more relaxed. Additionally, note that the U states are significantly populated. This is expected to be the case at the melting temperature. In this movie, we see that the protein starts fully extended. Then, hydrogen bonds begin to form between the protein residues and the protein backbone. Once enough hydrogen bonds form, the protein gains some structure and transitions to the collapsed state. Because this is not the fully folded state, the protein is able to easily transition back to the extended state. Eventually, the collapsed state is once again populated, and more hydrogen bonds form. This pushes the protein into the F basin, where it assumes the folded structure.
We plan to follow this post with two technical posts explaining the methods used in this study. In our next post, we will provide some guidelines and methods for designing simulation experiments that correspond to physical experiments. The post following that will explain how the atomistic interaction model used to run these simulations impacts the resultant folding mechanism.
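The link between state populations and the energy landscape can be sketched with the Boltzmann relation, where the free energy of a state is proportional to the negative logarithm of its population. The populations below are made up for illustration: they only respect the statement that F and the combined U states are equally populated at 340 K, and the split between U_C and U_E is hypothetical.

```python
import math

R = 8.314462618e-3   # gas constant in kJ/(mol K)


def relative_free_energies(populations, temperature):
    """Free energy of each state in kJ/mol, relative to the most populated state."""
    total = sum(populations.values())
    probabilities = {state: p / total for state, p in populations.items()}
    p_max = max(probabilities.values())
    return {
        state: -R * temperature * math.log(p / p_max)
        for state, p in probabilities.items()
    }


# Hypothetical populations at the melting temperature (not values from our paper).
populations = {"F": 0.50, "U_C": 0.30, "U_E": 0.20}
energies = relative_free_energies(populations, temperature=340.0)
for state, energy in energies.items():
    print(f"{state:4s} {energy:5.2f} kJ/mol")
```

The most populated state sits at zero, and less populated states sit higher on the landscape.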
JCP paper: dx.doi.org/10.1063/1.4993207