MMTSB Tool Set - Ab initio protein structure prediction

This tutorial will illustrate how to use the MMTSB Tool Set to perform ab initio protein structure prediction with MONSSTER without using a template.

As an example, we will fold the protein BPTI (bovine pancreatic trypsin inhibitor, PDB code: 5PTI) from extended conformations with the help of secondary structure information and knowledge of the three disulfide bonds that are present in the native protein.

1. Exploration of the SICHO model and MONSSTER

Begin by copying/downloading the PDB file for BPTI (5PTI).

To prepare for this exercise, extract the amino acid sequence with

    genseq.pl -out one 5PTI.pdb > sequence

and predict the secondary structure with PSIPRED:

    psipred.pl sequence > secondary.prediction

From the sequence file and the secondary structure prediction, generate a MONSSTER sequence file that contains both the amino acid sequence and the secondary structure assignments:

    genseq.pl -2ndone secondary.prediction -one sequence > monsster.seq

First, we will test the resolution of the SICHO model. Generate a SICHO representation of the native structure of BPTI with:

    genchain.pl 5PTI.pdb > sicho.chain

Now, rebuild an all-atom structure from the SICHO chain and the sequence file:

    rebuild.pl monsster.seq sicho.chain > allatom.rebuilt.pdb

How far does the rebuilt model deviate from the initial structure? Use rms.pl to find out:

    rms.pl -fit -out CA 5PTI.pdb allatom.rebuilt.pdb
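To make the quantity being measured concrete, the following is a minimal Python sketch of a C-alpha RMSD computed after least-squares superposition (the Kabsch algorithm). It is an illustration only and is not how rms.pl is implemented; the function name fitted_rmsd and the coordinates are hypothetical, and NumPy is assumed to be installed.

```python
import math
import numpy as np

def fitted_rmsd(ref, mob):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    ref = np.asarray(ref, dtype=float)
    mob = np.asarray(mob, dtype=float)
    # Remove the translation by centering both sets on their centroids.
    ref_c = ref - ref.mean(axis=0)
    mob_c = mob - mob.mean(axis=0)
    # Kabsch: the SVD of the covariance matrix yields the optimal rotation.
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = mob_c @ rot - ref_c
    return float(np.sqrt((diff ** 2).sum() / len(ref)))

# Hypothetical coordinates: four points, then a rotated and shifted copy.
ref = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 2.0, 0.0], [0.0, 2.0, 3.0]])
theta = 0.7
rz = np.array([[math.cos(theta), -math.sin(theta), 0.0],
               [math.sin(theta),  math.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
mob = ref @ rz.T + np.array([5.0, -3.0, 2.0])

# A rigid-body copy should superimpose essentially perfectly.
print(fitted_rmsd(ref, mob))
```

Because the fit removes rigid-body translation and rotation, only genuine differences in shape contribute to the reported value.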

The resulting RMSD gives you an idea of the resolution of the SICHO representation, but what about the accuracy of the energy function? To answer this question, we will run a short lattice simulation with MONSSTER starting from the native conformation:

    latticesim.pl -par ncycle=20 -const 1.0 -chain sicho.chain monsster.seq

After this command has finished, the final conformation can be found in monsster.final.chain. Rebuild an all-atom structure as before and compare it with the native structure. How far does this structure deviate? The larger deviation of around 5 Å indicates that the native state is not exactly at the minimum of the energy function. It also suggests that it would be very difficult to obtain ab initio predictions from MONSSTER simulations that are substantially closer to the native state.

2. Ab initio sampling of BPTI

Using only the sequence, the predicted secondary structure, and restraints representing the three disulfide linkages, we will now generate a number of conformations from simulated annealing MONSSTER runs:

    enslatsim.pl -seq monsster.seq -sa 2.0 \
                 -par tsteps=8,ncycle=5 -run 50 -d 2.0 5:55=14:38=30:51 \
                 -dir sampling -rnd

This command will take some time to finish.
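The restraint argument in the enslatsim.pl command, 5:55=14:38=30:51, lists the three disulfide pairs of BPTI as residue pairs joined by "=". If you want to adapt the command to another protein, a small helper can build the string from a list of pairs. The Python sketch below assumes only the format visible in the command above; the function names are hypothetical and are not part of the MMTSB Tool Set.

```python
def format_pair_restraints(pairs):
    """Join (i, j) residue pairs as 'i:j' items separated by '='."""
    return "=".join(f"{i}:{j}" for i, j in pairs)

def parse_pair_restraints(text):
    """Inverse of format_pair_restraints: '5:55=14:38' -> [(5, 55), (14, 38)]."""
    return [tuple(int(x) for x in item.split(":")) for item in text.split("=")]

# Disulfide bonds of BPTI, as used in the command above.
bpti_disulfides = [(5, 55), (14, 38), (30, 51)]
restraint_arg = format_pair_restraints(bpti_disulfides)
print(restraint_arg)  # 5:55=14:38=30:51
```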

The resulting structures can now be energy minimized:

    ensmin.pl -par dielec=rdie,epsilon=4,minsteps=100 -dir sampling lat min

... evaluated with an MMGB/SA function ...

    ensrun.pl -set score:1 -dir sampling min enerCHARMM.pl -par gb,nocut

... and clustered:

    enscluster.pl -kclust -radius 10 -dir sampling min

With the tool bestcluster.pl it is then possible to obtain the cluster of structures with the lowest average score:

    bestcluster.pl -prop score -dir sampling min

and the structure with the lowest score from the cluster with the lowest average score (and more than one cluster element):

    ensfiles.pl -cluster  -sort score -dir sampling min
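The selection rule described above (lowest average score among clusters with more than one element, then the lowest-scoring member of that cluster) can be written out as a short Python sketch. This only illustrates the logic and does not read MMTSB ensemble files; the cluster tags, structure indices, and scores below are made up.

```python
from statistics import mean

def pick_representative(clusters):
    """clusters maps a cluster tag to a list of (structure_index, score)."""
    # Ignore singleton clusters, as in the protocol described above.
    candidates = {tag: members for tag, members in clusters.items()
                  if len(members) > 1}
    if not candidates:
        raise ValueError("no cluster has more than one member")
    # Cluster with the lowest average score.
    best_tag = min(candidates,
                   key=lambda tag: mean(score for _, score in candidates[tag]))
    # Lowest-scoring structure within that cluster.
    best_index, best_score = min(candidates[best_tag], key=lambda m: m[1])
    return best_tag, best_index, best_score

# Hypothetical clustering result (tags, indices, and scores are invented).
clusters = {
    "t.1": [(3, -1210.0), (17, -1185.5), (42, -1198.2)],
    "t.2": [(8, -1250.0)],
    "t.3": [(5, -1150.0), (29, -1175.0)],
}
print(pick_representative(clusters))  # ('t.1', 3, -1210.0)
```

Note that the single lowest score in this invented data belongs to the singleton cluster t.2, which the rule deliberately skips.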

Take a look at the resulting structure with VMD. Does it look like a protein?

Since we know the native structure, we can also compare all of the sampled conformations with the native structure:

    calcprop.pl -natpdb 5PTI.pdb -dir sampling min

Obtain a list of RMSD vs. energy with

    getprop.pl -prop rmsdca,score -dir sampling min

and graph the result. Is this scoring function a good measure to indicate which structures are most native-like? How does the structure identified with the scoring/clustering protocol compare to the best structure from the entire ensemble?
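Before graphing, a quick numerical summary can help answer these questions. The Python sketch below assumes that the list has been saved as text with the RMSD and the score as the last two whitespace-separated columns of each line; this layout is an assumption and should be checked against the actual getprop.pl output. The sample values are invented.

```python
def parse_props(text):
    """Return (rmsd, score) pairs from the last two columns of each line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        rows.append((float(fields[-2]), float(fields[-1])))
    return rows

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def summarize(rows):
    """Correlation plus the lowest-score and lowest-RMSD structures."""
    rmsds = [r for r, _ in rows]
    scores = [s for _, s in rows]
    return {
        "correlation": pearson(rmsds, scores),
        "lowest_score": min(rows, key=lambda row: row[1]),
        "lowest_rmsd": min(rows, key=lambda row: row[0]),
    }

# Invented sample data in the assumed two-column layout.
sample = """
5.8 -1210.0
7.4 -1198.2
9.9 -1150.0
6.5 -1225.0
"""
summary = summarize(parse_props(sample))
print(summary)
```

A clearly positive correlation would mean that lower scores tend to go with lower RMSD. Comparing the lowest_score and lowest_rmsd entries shows how far the score-selected structure is from the best structure actually sampled.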