Intrinsically Disordered Proteins

Part 1: What they are (not)

meh guy
Biocord
Dec 30, 2021

Anyone with at least some education in biology has probably been taught about proteins and their structure: a chain of amino acids bends and twists, in a process called “folding”, into a well-defined 3D structure. Entire databases, such as the PDB, are devoted to these structures, and they make for great pictures in textbooks.

ATP synthase from yeast mitochondria (PDB ID: 1QO1)

Lies. All lies.

Well, it is less outright purposeful deception and more a case of lies-to-children and “nothing in biology actually follows hard rules no matter how much we like them”-itis, but we still need to rethink our models. It turns out that a well-defined 3D structure is not a feature of all proteins, and the ones that don’t follow this pattern have been labeled “intrinsically disordered proteins”. So let’s review the classical protein structure paradigm first, and then we will try to poke some holes in it.

Globular protein structure

As I mentioned before, proteins are built out of simple elements called amino acids linked in a long chain. There are 20 standard variants of these building blocks (and a few non-standard ones like selenocysteine).

The 20 protein amino acids (Reprinted from Campbell Biology 12th ed.)

As you can see from the picture above, all of them share a common backbone with an amine group on the left and a carboxylic acid group on the right, which is where the name comes from. A side chain sticks out from this backbone, and it is the distinguishing element of each amino acid. Special mention must go to proline, whose side chain loops back to the amine nitrogen, giving it some special properties that will become relevant later. These subunits serve as links in the polypeptide chain, joined amine group to acid group by peptide bonds. Because every link has a distinct amine end and acid end, the chain has a direction: it runs from a free amine group (the N-terminus) to a free acid group (the C-terminus). The sequence of amino acids in the chain is called the primary structure. All proteins, ordered and disordered, have this structure.

Primary structure of a protein (Reprinted from Campbell Biology 12th ed.)

The polypeptide chain now consists of a long backbone formed from nitrogens, alpha-carbon atoms (the ones with the side chains sticking out from them) and carbonyl groups (carbon with double-bonded oxygen). What can happen next is that the NH and CO groups of the backbone form hydrogen bonds with each other, creating local secondary structures such as the alpha-helix and beta-sheet.

The alpha-helix and beta-sheet (Reprinted from Voet and Voet — Biochemistry 4th edition)

But those NH and CO groups could just as well have formed hydrogen bonds with water, so what’s stopping them? Here is where the side chains of amino acids come into play. As the neat categorization in the figure a bit above tells us, some of those side chains are hydrophobic, which means that they can’t form hydrogen bonds with water. A water molecule next to such a side chain has a direction in which it can’t turn if it “wants” to keep making hydrogen bonds, so it has less freedom of rotation. As a result, the water molecules form a local cage-like structure around the side chain. That is unfavorable, because the decreased freedom of movement means less entropy, and Papa Thermodynamics doesn’t like that.

An illustration of the hydrophobic effect (Reprinted from Garrett and Grisham — Biochemistry 6th edition)

So what does our polypeptide do when it has such unpleasant-to-work-with side chains? Well, it tries to hide them inside the protein, forming a hydrophobic core. And in this hydrophobic core, the CO and NH groups of the polypeptide chain have nothing to hydrogen bond with but each other, so they form the recognizable secondary structures. And so we get the familiar sort of protein structure. The classical paradigm says that the structure of the protein determines the function of the protein, which is why finding out what these structures are is so important for biochemists.

Wiggle room

In general, it is assumed that proteins are more or less rigid. The pieces of the protein interact with each other through van der Waals forces, hydrogen bonds, some ionic interactions, and every now and then a disulfide bridge or two just to make sure the structure sticks together. But even globular proteins can change their shape depending on the conditions, which is of course super relevant to their function. The way the polypeptide chain twists and turns is often called its conformation, so a change of shape is called a conformational transition. An example of such a transition would be hemoglobin slightly changing the arrangement of its subunits upon binding oxygen.

Another example of such transitions can be seen in enzyme dynamics. Enzymes are the proteins that catalyze chemical reactions in your body. They are like little machines in a factory, often part of a larger assembly line which in biochem lingo we call a metabolic pathway. The most basic model of their function is called the lock and key model: the substrate (the thing undergoing the reaction) fits into the protein as a key would fit into a lock, a reaction happens, and then the products go their merry way. That model, however, got refined into the induced fit model, or the hand and glove model, in which the protein slightly changes its shape upon binding the substrate to accommodate it.

And then there are proteins that are all about movement, so-called molecular motors, like the actin-myosin pair that powers all our movements, or the ATP synthase that produces the basic energetic currency of the cell, ATP (they deserve their own blog feature some day). However, most of the time even these are viewed in terms of a few well-defined conformations that the protein jumps between. We need to go beyond that.

IDPs, what are they then?

Now, the simplest definition of intrinsically disordered proteins is just… proteins that don’t do all that. They don’t form nice and tidy secondary and tertiary structures like other proteins do. It’s not a very formal definition, as it basically only defines the term in relation to another term, ordered proteins (waiter, there is some structuralism in my science, what are the humanities doing here?). However, it works well enough for basic usage by experimentalists, for whom what matters the most is that these proteins behave weirdly.

Simulated possible conformations of the Aβ peptide (Reprinted from: https://doi.org/10.1021/ACSCENTSCI.7B00626)

Someone with a more theoretical bent can try to define IDPs through thermodynamics. Basically, each of the conformational states of the protein is associated with some amount of energy, or more precisely Gibbs free energy. The lower the free energy of a state, the more often it is occupied. Globular proteins have a single, well-defined minimum in their energy landscape. IDPs have either a shallower minimum, meaning some basic structure with lots of possible deviations, or no single dominant minimum at all, just a rugged landscape of many states with comparable energies.

Comparison of different types of energy landscapes (Reprinted from: https://doi.org/10.1073/pnas.0807977105)
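To make the link between energy and occupancy a bit more concrete, here is a minimal sketch using the Boltzmann distribution, the standard relation between the free energy of a state and its equilibrium population. The function name and both sets of energies are made up for illustration; they are not taken from any real protein or from the papers cited here.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)

def boltzmann_populations(free_energies_kj_mol, temperature_k=298.0):
    """Return the equilibrium fraction of molecules in each state."""
    rt = R_KJ * temperature_k
    # Shift by the lowest energy for numerical stability;
    # this does not change the resulting populations.
    g_min = min(free_energies_kj_mol)
    weights = [math.exp(-(g - g_min) / rt) for g in free_energies_kj_mol]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical landscapes (invented numbers, for illustration only).
funneled = [0.0, 20.0, 22.0, 25.0]  # one deep minimum, globular-like
rugged = [0.0, 0.5, 1.0, 1.5]       # several states of similar energy

for name, energies in [("funneled", funneled), ("rugged", rugged)]:
    populations = boltzmann_populations(energies)
    print(name, [round(p, 3) for p in populations])
```

With one deep minimum, nearly all molecules sit in a single state; when the states differ by less than the thermal energy (about 2.5 kJ/mol at room temperature), the population spreads over all of them, which is the toy version of a disordered ensemble.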

This definition is nice and tidy from a biophysicist’s perspective, but it has one problem: we can’t really see what the energy landscape looks like for intrinsically disordered proteins. Figuring out what the different possible states of the polypeptide chain are, and how often they’re occupied, is a truly daunting task.

However, the distinction between weakly funneled and rugged energy landscapes points to the fact that not all IDPs are created equal. A 2001 paper by Dunker et al. proposed a so-called Protein Trinity:

Protein trinity (Reprinted from: https://doi.org/10.1016/s1093-3263(00)00138-8)

It proposed three kinds of states a protein can assume: the boring old ordered state; a molten globule state, in which there is some structure but with a lot of flexibility; and a random coil state, which is basically spaghetti floating in the solution. We could go for even more detail and place proteins on a spectrum from fully ordered to fully disordered, with most of them falling somewhere in between.

Spectrum for protein disorder (Reprinted from Uversky — Intrinsically Disordered Proteins)

Now, what allows disordered proteins to behave the way they do? Well, first, they are deficient in amino acid residues with bulky or hydrophobic side chains. That means that the side chains can quite easily interact with water as well as with each other, and don’t get in the way of the main chain bending. The residues are quite often charged, with a lot of charged residues in close vicinity. If the charged residues are of the same sign, the electrostatic repulsion further prevents them from packing tightly. Proline residues are another common element: due to their weird structure, they mess with any sort of secondary protein structure.
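These sequence features are simple enough to count. The sketch below is a rough illustration, not a disorder predictor: the function name, the choice of which residues count as hydrophobic, and both example sequences are assumptions made up for this post. Histidine is left out of the charge count because it is mostly neutral around pH 7.

```python
# Residue classes as one-letter codes. The hydrophobic set is one common
# grouping; other classifications draw the line differently.
HYDROPHOBIC = set("AVLIMFW")
POSITIVE = set("KR")
NEGATIVE = set("DE")

def composition_summary(sequence):
    """Summarize the sequence features discussed in the text."""
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must not be empty")
    hydrophobic = sum(1 for aa in seq if aa in HYDROPHOBIC)
    positive = sum(1 for aa in seq if aa in POSITIVE)
    negative = sum(1 for aa in seq if aa in NEGATIVE)
    return {
        "hydrophobic_fraction": hydrophobic / n,
        "net_charge_per_residue": (positive - negative) / n,
        "proline_fraction": seq.count("P") / n,
    }

# Two invented sequences (not real proteins), written N-terminus first.
core_like = "MKVLAAGIVFLWAL"       # rich in hydrophobic residues
disorder_like = "PEEKSPEEKDPSEEK"  # charged, proline-rich

for name, seq in [("core-like", core_like), ("disorder-like", disorder_like)]:
    print(name, composition_summary(seq))
```

A sequence with few hydrophobic residues, a sizable net charge, and plenty of prolines matches the profile described above, but a count like this says nothing definitive about whether a real protein is actually disordered.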

How these features translate to measurable properties of the proteins, and the ways in which these features differ from their ordered counterparts will be covered in Part 2 of this series. Coming… whenever I get to writing it!

References

Urry L. A. et al. (2021). Campbell Biology 12th edition. Pearson

Voet D., Voet J. G. (2010). Biochemistry 4th edition. John Wiley & Sons

Garrett R. H., Grisham C. M. (2016). Biochemistry 6th edition. Cengage. ISBN-13: 978-1-305-57720-6

Tompa, P. (2010). Structure and function of intrinsically disordered proteins. Chapman & Hall/CRC Press.

Uversky, V. N. (2014). Intrinsically Disordered Proteins. https://doi.org/10.1007/978-3-319-08921-8_1

Dunker, A. K. et al. (2001). Intrinsically disordered protein. Journal of Molecular Graphics and Modelling, 19(1), 26–59. https://doi.org/10.1016/S1093-3263(00)00138-8

Das, P., Matysiak, S., & Mittal, J. (2018). Looking at the Disordered Proteins through the Computational Microscope. ACS Central Science, 4(5), 534–542. https://doi.org/10.1021/ACSCENTSCI.7B00626

Papoian, G. A. (2008). Proteins with weakly funneled energy landscapes challenge the classical structure–function paradigm. Proceedings of the National Academy of Sciences, 105(38), 14237–14238. https://doi.org/10.1073/PNAS.0807977105

meh guy
Biocord

A biochemist who likes to sometimes pretend he’s smart