MMTSB Tool Set - Ab initio protein structure prediction

Objective and Overview

This tutorial will illustrate how to use the MMTSB Tool Set to perform ab initio protein structure prediction with MONSSTER without using a template.

As an example, we will fold the protein BPTI (bovine pancreatic trypsin inhibitor, PDB code: 5PTI) from extended conformations with the help of secondary structure information and knowledge of the three disulfide bonds that are present in the native protein.

1. Exploration of the SICHO model and MONSSTER

Begin by copying/downloading the PDB file for BPTI (5PTI).

To prepare for this exercise, extract the amino acid sequence with
    genseq.pl -out one 5PTI.pdb > sequence
    
and predict the secondary structure with PSIPRED:
    psipred.pl sequence > secondary.prediction
    
From the sequence file and secondary structure prediction generate a MONSSTER sequence file that contains both the amino acid sequence and secondary structure assignments:
    genseq.pl -2ndone secondary.prediction -one sequence > monsster.seq
    
First, we will test the resolution of the SICHO model. Generate a SICHO representation of the native structure of BPTI with:
    genchain.pl 5PTI.pdb > sicho.chain
    
Now, rebuild an all-atom structure from the SICHO chain and the sequence file:
    rebuild.pl monsster.seq sicho.chain > allatom.rebuilt.pdb
    
How far does the rebuilt model deviate from the initial structure? Use rms.pl to find out:
    rms.pl -fit -out CA 5PTI.pdb allatom.rebuilt.pdb
    
This result will give you an idea of the resolution of the SICHO representation, but what about the accuracy of the energy function? To answer this question, we will run a short lattice simulation with MONSSTER starting from the native conformation:
    latticesim.pl -par ncycle=20 -const 1.0 -chain sicho.chain monsster.seq
    
After this command has finished, the final conformation can be found in monsster.final.chain. Rebuild an all-atom structure as before and compare it with the native structure. How far does this structure deviate? The larger deviation of around 5 Å indicates that the native state is not exactly at the minimum of the energy function. It also suggests that it would be very difficult to obtain ab initio predictions from MONSSTER simulations that are substantially closer to the native state.
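
The rebuild-and-compare step can be sketched as a small shell function that reuses only the rebuild.pl and rms.pl invocations shown earlier. The function name compare_final_chain and the output file name allatom.final.pdb are hypothetical choices for this example, and the sketch assumes that monsster.seq, monsster.final.chain and 5PTI.pdb are in the current directory.

```shell
# Sketch only: rebuild the final lattice chain and compare it with the native.
# allatom.final.pdb is a hypothetical output name chosen for this example.
compare_final_chain() {
    # Stop early if the MMTSB tools are not on the PATH.
    command -v rebuild.pl >/dev/null 2>&1 || { echo "rebuild.pl not found" >&2; return 1; }
    command -v rms.pl >/dev/null 2>&1 || { echo "rms.pl not found" >&2; return 1; }

    # Same commands as for the native chain, applied to the simulation output.
    rebuild.pl monsster.seq monsster.final.chain > allatom.final.pdb || return 1
    rms.pl -fit -out CA 5PTI.pdb allatom.final.pdb
}
```

Call compare_final_chain after latticesim.pl has finished and compare the reported value with the one obtained for allatom.rebuilt.pdb.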

2. Ab initio sampling of BPTI

Using only the sequence, the predicted secondary structure, and restraints representing the three disulfide linkages, we will now generate a number of conformations from simulated annealing MONSSTER runs:

    enslatsim.pl -seq monsster.seq -sa 2.0 \
                 -par tsteps=8,ncycle=5 -run 50 -d 2.0 5:55=14:38=30:51 \
                 -dir sampling -rnd
    
This command will take some time to finish.
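
The residue pairs in the command above, 5:55, 14:38 and 30:51, are the three disulfide bonds of BPTI (Cys5-Cys55, Cys14-Cys38, Cys30-Cys51), joined with '='. For another protein, the same kind of string could be assembled from a list of pairs. The following is a minimal sketch; make_pairs is a hypothetical helper and not part of the MMTSB Tool Set.

```shell
# Hypothetical helper: join residue pairs (i:j) with '=' into one string.
# The function body runs in a subshell so the IFS change does not leak out.
make_pairs() (
    IFS='='
    printf '%s\n' "$*"
)

# Disulfide bonds of BPTI: Cys5-Cys55, Cys14-Cys38, Cys30-Cys51
pairs=$(make_pairs 5:55 14:38 30:51)
echo "$pairs"
```

The resulting string can be used in place of the literal pair list in the enslatsim.pl command.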

The resulting structures can now be energy minimized:
    ensmin.pl -par dielec=rdie,epsilon=4,minsteps=100 -dir sampling lat min
    
... evaluated with an MMGB/SA function ...
 
    ensrun.pl -set score:1 -dir sampling min enerCHARMM.pl -par gb,nocut
    
... and clustered:
    enscluster.pl -kclust -radius 10 -dir sampling min
    
With the tool bestcluster.pl it is then possible to obtain the cluster of structures with the lowest average score:
    bestcluster.pl -prop score -dir sampling min
    
and the lowest-scoring structure from the cluster with the lowest average score (considering only clusters with more than one member):
    ensfiles.pl -cluster  -sort score -dir sampling min
    
Take a look at the resulting structure with VMD. Does it look like a protein?

Since we know the native structure, we can also compare all of the sampled conformations with the native structure:
    calcprop.pl -natpdb 5PTI.pdb -dir sampling min
    
Obtain a list of RMSD vs. energy with
    getprop.pl -prop rmsdca,score -dir sampling min
    
and graph the result. Is this scoring function a good indicator of which structures are most native-like? How does the structure identified with the scoring/clustering protocol compare to the best structure from the entire ensemble?
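
Besides a graph, a quick numerical check can help answer these questions. The sketch below assumes that the list has been saved to a file with two whitespace-separated columns, CA RMSD first and score second. That layout, the file name rmsd_score.dat, and the helper names pearson and lowest_score are assumptions made for this example; adjust the column numbers if the actual getprop.pl output differs.

```shell
# Assumed input: two whitespace-separated columns, CA RMSD then score,
# e.g. saved with:
#   getprop.pl -prop rmsdca,score -dir sampling min > rmsd_score.dat

# Pearson correlation between column 1 (RMSD) and column 2 (score).
pearson() {
    awk '!/^#/ && NF >= 2 {
             x = $1 + 0; y = $2 + 0
             n++; sx += x; sy += y; sxx += x*x; syy += y*y; sxy += x*y
         }
         END {
             if (n < 2) exit 1
             den = sqrt((n*sxx - sx*sx) * (n*syy - sy*sy))
             if (den == 0) exit 1
             printf "%.3f\n", (n*sxy - sx*sy) / den
         }' "$1"
}

# Print the line with the lowest score (column 2).
lowest_score() {
    awk '!/^#/ && NF >= 2 {
             s = $2 + 0
             if (!seen || s < best) { best = s; line = $0; seen = 1 }
         }
         END { if (seen) print line }' "$1"
}
```

For example, pearson rmsd_score.dat prints the correlation coefficient and lowest_score rmsd_score.dat prints the RMSD and score of the lowest-scoring structure. A clearly positive correlation would suggest that lower scores tend to go with more native-like structures.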

Written by M. Feig