Folding time for the HP model

1. The purpose of this problem is to give you some idea for the time scales for protein folding. As with the last problem, we'll study the HP model but now on a 3d cubic lattice. This paper Links to an external site. illustrates how slow the "HP model" is even for moderate size chains.

In general, it is hard to find efficient methods to find the ground states of protein models. However for the HP model, there have been some relatively fast algorithms that have been developed Links to an external site.. These are not meant to mimic the way nature folds proteins, but clever short-cuts that manage to find minimum energy states it a much smaller number of operations. Physical folding is much closer to "short range" Monte Carlo moves, where individuals monomers are rotated or flipped in one step. hw7/lattice_protein.py implements such a model, but uses C code to do the actual Monte Carlo moves where as in the last problem, it was all implemented in python. The code uses "weave" in order to compile and execute C code. Some implementations of SciPy do not appear to support weave so you should check that this works on your system.

The Monte Carlo code simulates a chain at finite temperature. In order to find the ground state, we can cool it down slowly. The lower the temperature, the slower we adjust the temperature, to give it more time to thoroughly explore configuration space.

The particular sequence used in the code comes from a bet made (discussed in the above paper of a 6-pack of beer) between a group in Harvard and one in UCSF. The Harvard group attempted to design an HP sequence that would fold into a structure that they chose. They thought they had an algorithm for doing this, and that it would be impossible for any other group to find their structure by Monte Carlo or other means. As it turned out, 9/10 times the UCSF group could find lower energies than expected by the Harvard design program.

In this code, you'll try to minimize the 10th sequence by Monte Carlo/simulated annealing. The ground state energy is -33. If you run the code and watch the print out of "Lowest energy". If you wait a few minutes, the chain will settle down. Often it won't reach the correct ground state, but sometimes it does. You can adjust this in the code by looking at the simulated annealing parameters: time, beta, b0, and t0. The way that they work are as follows. In the beginning you set "time", which refers to the number of iterations of the code at the initial value of beta. When you enter the "while beta < 10.0:" loop, the value of time is multiplied by t0, and beta is multiplied by b0. So the time at higher beta (lower temperature) is being increased, which makes physical sense.

What happens if you reduce the value of t0? Does the simulation do better or worse?

In the graphics, the red balls are H's and the green are P's. Can you describe in general features you see for the minimized structures?

2. Connection to real systems.

What are typical folding times for proteins? How does this compare with the typical time for a ribosome to synthesize a protein? Is the folding time long or short on a microscopic time scale?
Suppose you try to minimize the HP model using Monte Carlo, and get a low energy structure with a fake alpha helix somewhere in your protein. But you can tell that the energy would be lowered if the alpha helix was to be rotated 90 degrees. Will your lattice simulation be able to easily rotate the helix? What happens if instead your model was not confined to a lattice?
For a random sequence of H's and P's, how to you expect the folding time to depend on chain length? What does this imply about biological sequences (which presumably aren't random).