# IV. The Protein Folding Problem

“Perhaps the most remarkable features of the molecule are its complexity and its lack of symmetry. The arrangement seems to be almost totally lacking in the kind of regularities which one instinctively anticipates, and it is more complicated than has been predicated by any theory of protein structure. Though the detailed principles of construction do not yet emerge, we may hope that they will do so at a later stage of the analysis.” – John Kendrew et al., on first determining the structure of the protein myoglobin by X-ray crystallography, via “The Protein-Folding Problem, 50 Years On” by Ken Dill

DNA exists in nearly every cell of every living organism. Human DNA is not only some 3 billion nucleotides long, it also encodes tens of thousands of genes (estimates have ranged from roughly 20,000 to 33,000), which give rise to over 1 million distinct proteins. Several kinds of processes copy, or ‘repeat’, the nucleotide sequences in DNA:

1.) DNA is replicated into additional DNA for cell division (mitosis)

2.) DNA is transcribed into RNA for transport outside the nucleus

3.) RNA is translated into protein molecules in the cytoplasm of the cell – by NobelPrize.org

RNA's role is not limited to protein synthesis. Many types of RNA are catalytic – they act like enzymes to help reactions proceed faster. Many other types of RNA play complex regulatory roles in cells (for more, see the central dogma of molecular biology).
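The transcription and translation steps listed above can be sketched in a few lines of Python. This is a toy illustration, not a bioinformatics tool: the function names `transcribe` and `translate` are hypothetical, the codon table is a small subset of the standard genetic code, and transcription is simplified to reading the coding strand and swapping T for U (in the cell, RNA is synthesized against the complementary template strand).

```python
# Small subset of the standard genetic code (mRNA codon -> one-letter amino acid).
# "*" marks a stop codon. A real table has 64 entries.
CODON_TABLE = {
    "AUG": "M",              # methionine, also the usual start codon
    "UUU": "F", "UUC": "F",  # phenylalanine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "AAA": "K", "AAG": "K",  # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",              # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Simplified transcription: copy the coding strand, replacing T with U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the strand."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X" = codon not in this toy table
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


dna = "ATGTTTGGCAAATAA"       # made-up example sequence
mrna = transcribe(dna)        # "AUGUUUGGCAAAUAA"
print(translate(mrna))        # MFGK
```

The output is only the linear chain of amino acids. Nothing in this sketch says how that chain folds, which is the problem the rest of this section is about.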

Genes act as recipes for protein molecules. Proteins are long chains of amino acids that become biologically active only after they fold. While often depicted as messy, squiggly strands lacking any symmetry, they ultimately fold very specifically into beautifully organized, highly complex 3-dimensional shapes such as micro pumps, two-legged walkers called kinesins, whip-like flagella that propel the cell, enzymes, and other micro-machinery. The proteins that are created ultimately determine the function of the cell.

Figure 10: This TEDx video by Ken Dill gives an excellent introduction to the protein folding problem and shows the amazing dynamical forms these proteins take.

The protein folding problem has been one of the great puzzles in science for 50 years. The questions it poses are:

1. “How does the amino acid sequence influence the folding to form a 3-D structure?
2. There are a nearly infinite number of ways a protein can fold, how can proteins fold to the correct structure so fast (nanoseconds for some)?
3. Can we simulate proteins with computers?”
– from The Protein-Folding Problem, 50 Years On by Ken Dill

Scientists now understand a great number of proteins, but several questions remain unanswered. For example, Anfinsen’s dogma is the postulate that the amino acid sequence alone determines the folded structure of the protein; we do not know whether this holds in general. We also know that molecular chaperones help other proteins to fold, although they are thought not to influence the protein’s final folded structure. We can produce computer simulations of how proteins fold. However, this is only possible in special cases of simple proteins where an energy gradient leads the protein downhill to a configuration of globally minimal energy (see Figure 11). Even in these cases, the simulations do not accurately predict protein stabilities or thermodynamic properties.

Figure 11: This graph shows the energy landscape for some proteins. When the landscape is reasonably smoothly downhill like this, protein folding can be simulated. Graph by Thomas Splettstoesser (www.scistyle.com) via Wikimedia Commons
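To see why a smoothly downhill landscape matters for simulation, here is a deliberately tiny sketch. It is not a protein model: the one-dimensional `energy` function, the `roughness` parameter, and the step size are all invented for illustration. It only shows that plain downhill search reaches the global minimum of a smooth funnel but gets stuck in a local dip once ripples are added.

```python
import math


def energy(x: float, roughness: float) -> float:
    """Toy 1-D 'funnel': a smooth bowl plus optional ripples when roughness > 0."""
    return x ** 2 + roughness * (1.0 - math.cos(10.0 * x))


def gradient(x: float, roughness: float) -> float:
    """Derivative of energy() with respect to x."""
    return 2.0 * x + 10.0 * roughness * math.sin(10.0 * x)


def descend(x0: float, roughness: float, step: float = 0.005, n_steps: int = 5000) -> float:
    """Plain gradient descent: always move a small step downhill."""
    x = x0
    for _ in range(n_steps):
        x -= step * gradient(x, roughness)
    return x


smooth_end = descend(3.0, roughness=0.0)  # smooth funnel
rugged_end = descend(3.0, roughness=1.0)  # same funnel with ripples

print(f"smooth landscape: x = {smooth_end:.4f}, E = {energy(smooth_end, 0.0):.4f}")
print(f"rugged landscape: x = {rugged_end:.4f}, E = {energy(rugged_end, 1.0):.4f}")
```

On the smooth landscape the walker ends at the bottom (x near 0). On the rugged one it settles into a nearby dip, far from the global minimum, which is a loose analogy for why rough energy landscapes are much harder to simulate.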

Figure 12: A TED Video (short) by David Bolinsky showing the complexity of the protein micro-machinery working away inside the cell. Despite all this complexity, organization, and beauty, little is understood about how proteins fold to form these amazing machines.

Protein folding generally happens in a fraction of a second (microseconds or less in the fastest cases), which is mind-boggling given the number of ways a chain could fold. This is known as Levinthal’s paradox, posited in 1969:

“To put this in perspective, a relatively small protein of only 100 amino acids can take some $10^{100}$ different configurations. If it tried these shapes at the rate of 100 billion a second, it would take longer than the age of the universe to find the correct one. Just how these molecules do the job in nanoseconds, nobody knows.” – Technology Review.com, “Physicists discover quantum law of protein folding”
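The arithmetic behind the quote is easy to check. The sketch below uses the quote's own figures ($10^{100}$ configurations, 100 billion trials per second); the age of the universe (about 13.8 billion years) and the function name `exhaustive_search_years` are assumptions added here for the comparison.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, about 3.16e7 seconds
AGE_OF_UNIVERSE_YEARS = 13.8e9         # assumed value, roughly 13.8 billion years


def exhaustive_search_years(n_configurations: float, trials_per_second: float) -> float:
    """Years needed to try every configuration once at a fixed sampling rate."""
    return n_configurations / trials_per_second / SECONDS_PER_YEAR


n_configurations = 10 ** 100  # figure from the quote, for a 100-residue protein
trials_per_second = 100e9     # "100 billion a second", from the quote

years = exhaustive_search_years(n_configurations, trials_per_second)
universe_ages = years / AGE_OF_UNIVERSE_YEARS

print(f"exhaustive search: {years:.2e} years")
print(f"that is about {universe_ages:.1e} times the age of the universe")
```

The search time comes out near $10^{81}$ years, so "longer than the age of the universe" is an understatement by some seventy orders of magnitude. Whatever proteins do, it is not random exhaustive search.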

The Arrhenius equation is used to estimate chemical reaction rates as a function of temperature. It turns out that applying this equation to protein folding misses badly. In 2011, L. Luo and J. Lu published a paper entitled “Temperature Dependence of Protein Folding Deduced from Quantum Transition”. They argue that a quantum mechanical treatment correctly predicts the temperature dependence of protein folding rates (hat tip chemistry.stackexchange.com). Further, globular proteins (as opposed to fibrous, structural proteins) are known to be marginally stable, meaning that there is very little energy difference between the folded, native state and the unfolded state. This kind of energy landscape may open the door to a host of quantum properties.
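For reference, the classical Arrhenius form is $k = A\,e^{-E_a/(RT)}$, where $k$ is the rate, $A$ a prefactor, $E_a$ the activation energy, $R$ the gas constant, and $T$ the absolute temperature. The sketch below evaluates it at a few temperatures. The values of `A` and `EA` are hypothetical round numbers chosen for illustration; they are not taken from Luo and Lu or from any measured protein.

```python
import math

R = 8.314462618  # gas constant in J/(mol*K)


def arrhenius_rate(prefactor: float, activation_energy: float, temperature: float) -> float:
    """Classical Arrhenius rate k = A * exp(-Ea / (R * T))."""
    return prefactor * math.exp(-activation_energy / (R * temperature))


A = 1.0e6      # hypothetical prefactor, in 1/s
EA = 50_000.0  # hypothetical activation energy, in J/mol

temperatures = [280.0, 300.0, 320.0]  # kelvin
rates = [arrhenius_rate(A, EA, T) for T in temperatures]

for T, k in zip(temperatures, rates):
    print(f"T = {T:.0f} K   k = {k:.3e} 1/s   ln k = {math.log(k):.3f}")

# In an Arrhenius plot (ln k against 1/T) the points lie on a straight line
# with slope -Ea/R.
slope = (math.log(rates[-1]) - math.log(rates[0])) / (1 / temperatures[-1] - 1 / temperatures[0])
print(f"slope = {slope:.1f} K, -Ea/R = {-EA / R:.1f} K")
```

Under this model the rate always rises with temperature and ln k is a straight line in 1/T. The text's point is that measured protein folding rates do not follow this simple form.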