This tutorial illustrates how to use the MMTSB Tool Set to perform
ab initio protein structure prediction with MONSSTER, that is, prediction
without a template.
As an example, we will fold the protein BPTI (bovine pancreatic trypsin
inhibitor, PDB code: 5PTI) from extended conformations with the help of secondary
structure information and knowledge of the three disulfide bonds that are
present in the native protein.
1. Exploration of the SICHO model and MONSSTER
Begin by copying/downloading the PDB file for BPTI (5PTI).
To prepare for this exercise, extract the amino acid sequence with
genseq.pl -out one 5PTI.pdb > sequence
and predict the secondary structure with PSIPRED:
psipred.pl sequence > secondary.prediction
From the sequence file and secondary structure prediction generate a MONSSTER
sequence file that contains both the amino acid sequence and secondary structure
assignments:
genseq.pl -2ndone secondary.prediction -one sequence > monsster.seq
First, we will test the resolution of the SICHO model. Generate a SICHO
representation of the native structure of BPTI with:
genchain.pl 5PTI.pdb > sicho.chain
Now, rebuild an all-atom structure from the SICHO chain and the sequence
file:
rebuild.pl monsster.seq sicho.chain > allatom.rebuilt.pdb
How far does the rebuilt model deviate from the initial structure? Use
rms.pl to find out:
rms.pl -fit -out CA 5PTI.pdb allatom.rebuilt.pdb
This result will give you an idea of how accurately the SICHO representation
can reproduce the native structure, but what about the accuracy of the energy
function? To answer this question, we will run a short lattice simulation with
MONSSTER starting from the native conformation:
latticesim.pl -par ncycle=20 -const 1.0 -chain sicho.chain monsster.seq
After this command has finished, the final conformation can be found in
monsster.final.chain. Rebuild an all-atom structure as before
and compare it with the native structure. How far does this structure deviate? The
larger deviation of around 5 Å indicates that the native state is not
exactly at the minimum of the energy function. It also suggests that
it would be very difficult to obtain ab initio predictions from MONSSTER
simulations that are substantially closer than this to the native state.
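The C-alpha RMSD comparisons in this section can also be checked outside the tool set. The following is a minimal Python sketch, assuming NumPy is available and that both PDB files contain the same residues in the same order. The helper names read_ca and kabsch_rmsd are illustrative and are not part of MMTSB. The sketch uses the standard Kabsch superposition, which may differ in detail from what rms.pl does internally, so small differences in the reported value are possible.

```python
import math
import numpy as np

def read_ca(lines):
    """Collect C-alpha coordinates from PDB ATOM records (fixed-column format)."""
    coords = []
    for line in lines:
        # Atom name sits in columns 13-16; column 17 is the alternate-location flag.
        # Only atoms with no alternate location, or location 'A', are kept.
        if line.startswith("ATOM") and line[12:16].strip() == "CA" and line[16] in " A":
            coords.append([float(line[30:38]), float(line[38:46]), float(line[46:54])])
    return np.array(coords)

def kabsch_rmsd(ref, mob):
    """RMSD between two N x 3 coordinate sets after optimal rigid superposition."""
    ref = np.asarray(ref, dtype=float)
    mob = np.asarray(mob, dtype=float)
    if ref.shape != mob.shape or ref.ndim != 2 or ref.shape[1] != 3:
        raise ValueError("both inputs must be N x 3 arrays of equal length")
    # Remove the translation by centering both sets on their centroids.
    p = mob - mob.mean(axis=0)
    q = ref - ref.mean(axis=0)
    # The singular values of the covariance matrix give the best achievable overlap.
    u, s, vt = np.linalg.svd(p.T @ q)
    # A negative determinant would mean a reflection; flip the smallest term instead.
    d = np.sign(np.linalg.det(u @ vt))
    overlap = s[0] + s[1] + d * s[2]
    msd = ((p ** 2).sum() + (q ** 2).sum() - 2.0 * overlap) / len(p)
    return math.sqrt(max(msd, 0.0))

# Possible usage with the files from this section (not executed here):
#   with open("5PTI.pdb") as f: native = read_ca(f)
#   with open("allatom.rebuilt.pdb") as f: rebuilt = read_ca(f)
#   print(kabsch_rmsd(native, rebuilt))
```

If the two files do not yield the same number of C-alpha atoms, kabsch_rmsd raises an error instead of silently comparing mismatched residues.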
2. Ab initio sampling of BPTI
Using only the sequence, the predicted secondary structure, and restraints
representing the three disulfide linkages, we will now generate a number
of conformations from simulated annealing MONSSTER runs:
enslatsim.pl -seq monsster.seq -sa 2.0 \
-par tsteps=8,ncycle=5 -run 50 -d 2.0 5:55=14:38=30:51 \
-dir sampling -rnd
This command will take some time to finish.
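The residue pair list in the command above encodes the three disulfide bonds of BPTI. As a quick sanity check before launching a long run, the list can be parsed and compared against the sequence. This is a minimal Python sketch, assuming that pairs are separated by "=" and the two residue numbers within a pair by ":", and that residue numbering starts at 1 and matches the position in the sequence; the function names are illustrative and not part of MMTSB.

```python
def parse_pairs(spec):
    """Split 'i:j=k:l=...' into a list of (i, j) residue-number tuples."""
    pairs = []
    for item in spec.split("="):
        first, second = item.split(":")
        pairs.append((int(first), int(second)))
    return pairs

def non_cysteines(sequence, pairs):
    """Return restrained residue numbers that are not cysteine in the sequence."""
    wrong = []
    for pair in pairs:
        for resnum in pair:
            # Residue numbers are 1-based; string indices are 0-based.
            if resnum < 1 or resnum > len(sequence) or sequence[resnum - 1] != "C":
                wrong.append(resnum)
    return wrong

disulfides = parse_pairs("5:55=14:38=30:51")
for i, j in disulfides:
    print(f"Cys {i} - Cys {j}")

# Possible usage with the one-letter sequence file generated earlier (not executed here):
#   with open("sequence") as f: seq = "".join(f.read().split())
#   print(non_cysteines(seq, disulfides))   # an empty list means all six residues are Cys
```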
The resulting structures can now be energy minimized:
ensmin.pl -par dielec=rdie,epsilon=4,minsteps=100 -dir sampling lat min
... evaluated with an MMGB/SA function ...
ensrun.pl -set score:1 -dir sampling min enerCHARMM.pl -par gb,nocut
... and clustered:
enscluster.pl -kclust -radius 10 -dir sampling min
With the tool bestcluster.pl it is then possible to obtain the
cluster of structures with the lowest average score:
bestcluster.pl -prop score -dir sampling min
and the structure with the lowest score from the cluster with the lowest
average score (and more than one cluster element):
ensfiles.pl -cluster -sort score -dir sampling min
Take a look at the resulting structure with VMD. Does it look like a protein?
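The selection rule used above (lowest-scoring structure from the cluster with the lowest average score, ignoring single-member clusters) can be written out explicitly. The following Python sketch only illustrates that rule; the cluster labels, structure indices, and scores are made-up example values, and the data layout is an assumption rather than the format used by the MMTSB ensemble tools.

```python
def best_cluster(clusters, min_size=2):
    """Pick the cluster with the lowest mean score and its lowest-scoring member.

    clusters maps a cluster label to a list of (structure_index, score) tuples.
    Clusters with fewer than min_size members are ignored.
    """
    eligible = {name: members for name, members in clusters.items()
                if len(members) >= min_size}
    if not eligible:
        raise ValueError("no cluster has enough members")

    def mean_score(name):
        members = eligible[name]
        return sum(score for _, score in members) / len(members)

    best = min(eligible, key=mean_score)
    best_member = min(eligible[best], key=lambda member: member[1])
    return best, best_member

# Hypothetical example data: three clusters, one of them with a single member.
example = {
    "t.1": [(1, -900.0), (4, -950.0)],
    "t.2": [(2, -1200.0)],
    "t.3": [(3, -940.0), (5, -930.0), (6, -980.0)],
}
label, (index, score) = best_cluster(example)
print(label, index, score)
```

In this made-up example the single-member cluster t.2 holds the lowest individual score but is skipped, which shows why the cluster-size condition matters: an isolated low-energy structure is not treated as the prediction.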
Since we know the native structure, we can also compare all of the sampled
conformations with the native structure:
calcprop.pl -natpdb 5PTI.pdb -dir sampling min
Obtain a list of RMSD vs. energy with
getprop.pl -prop rmsdca,score -dir sampling min
and graph the result. Is this scoring function a good indicator of
which structures are most native-like? How does the structure identified with
the scoring/clustering protocol compare to the best structure from the
entire ensemble?
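Besides plotting, the questions above can be approached numerically. The following Python sketch assumes the getprop.pl output has been redirected to a file (the name rmsd_score.dat is hypothetical) and that each line ends with two numeric columns, RMSD followed by score; the actual column layout should be checked first. It reports the correlation between RMSD and score and compares the lowest-score structure with the lowest-RMSD one.

```python
import math

def read_pairs(lines):
    """Take the last two columns of each line as (rmsd, score); skip other lines."""
    data = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            data.append((float(fields[-2]), float(fields[-1])))
        except ValueError:
            continue  # header or malformed line
    return data

def pearson(data):
    """Pearson correlation coefficient between the two columns."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    syy = sum((y - mean_y) ** 2 for _, y in data)
    return sxy / math.sqrt(sxx * syy)

def summarize(data):
    """Return the (rmsd, score) entries with the lowest score and the lowest RMSD."""
    return {
        "lowest_score": min(data, key=lambda pair: pair[1]),
        "lowest_rmsd": min(data, key=lambda pair: pair[0]),
    }

# Possible usage (not executed here):
#   with open("rmsd_score.dat") as f: data = read_pairs(f)
#   print(pearson(data), summarize(data))
```

A clearly positive correlation would mean that lower scores tend to go with lower RMSD, which is what a useful scoring function should show. If the lowest-score entry has a much higher RMSD than the lowest-RMSD entry, the score alone did not identify the most native-like structure.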