This tutorial will illustrate how to use Modeller in combination with the MMTSB Tool Set to model
a missing loop in a given protein structure.
We will revisit the structure of ribonuclease A from PDB code 1RNU,
shown on the right. In this structure, residues 16-23 are not resolved. These residues will
be modeled in this tutorial.
1. Preparation of input files
Copy the PDB file for 1RNU to the current directory as 1RNU.pdb, then extract the protein chain with
the following command:
convpdb.pl -nsel protein 1RNU.pdb > 1rnu.pdb
2. Generation of loop models
We will now call Modeller to generate 200 different models for residues 16 through 23. The sequence for
the missing residues is STSAASSS. The command is:
loopModel.pl -models 200 -loop 16:STSAASSS 1rnu.pdb > modeller.scores
This command will take 10-20 minutes to finish. When it is done, the current directory will contain the
completed structures in files model.1.pdb through model.200.pdb. The output file modeller.scores
contains the score that Modeller assigned to each of the structures.
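Before moving on, it can be worth confirming that all 200 models were written and glancing at the best-scoring ones. The following is a minimal sketch using only standard shell utilities. It assumes that modeller.scores is whitespace-separated with the score in the third column, and that lower Modeller scores are better. The function names are illustrative and are not part of the MMTSB Tool Set.

```shell
# Count the model.N.pdb files in the current directory.
count_models() {
  n=0
  for f in model.*.pdb; do
    # The pattern stays unexpanded if nothing matches, so test for existence.
    [ -e "$f" ] && n=$((n + 1))
  done
  echo "$n"
}

# Print the N lines of FILE with the lowest value in column 3.
# Assumption: column 3 holds the score and lower is better.
top_scores() {
  LC_ALL=C sort -k3,3n "$1" | head -n "$2"
}

# Only run the checks if the score file is present.
if [ -f modeller.scores ]; then
  echo "models found: $(count_models)"
  top_scores modeller.scores 5
fi
```

If fewer than 200 models are reported, the loopModel.pl run probably did not finish.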
3. Model analysis
We will proceed by checking the structures into an ensemble in order to facilitate further analysis:
checkin.pl -dir ens model model.?.pdb model.??.pdb model.???.pdb
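The three separate wildcard patterns are presumably used so that the files are passed to checkin.pl in numerical order (1 to 9, then 10 to 99, then 100 to 200). A single model.*.pdb pattern would expand in lexical order, placing model.10.pdb before model.2.pdb. The sketch below demonstrates the difference with empty placeholder files in a scratch directory. It does not call any MMTSB tool, and the helper name is made up for this illustration.

```shell
# List placeholder model files the way the three-pattern glob expands them.
numeric_glob_order() {
  ( cd "$1" && echo model.?.pdb model.??.pdb model.???.pdb )
}

demo=$(mktemp -d)
for i in 1 2 10 11 100; do
  : > "$demo/model.$i.pdb"   # empty stand-ins for real models
done

# Single pattern: lexical order (the exact order depends on the locale).
( cd "$demo" && echo model.*.pdb )

# Three patterns: one-, two-, then three-digit numbers, i.e. numeric order.
numeric_glob_order "$demo"

rm -r "$demo"
```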
We can associate the scores from the output file modeller.scores (in the third column)
with the structures in the ensemble using the following command:
setprop.pl -dir ens -f modeller.scores -inx 3 model score
Let us now cluster the conformations based only on the differences for residues 16 through 23:
enscluster.pl -kclust -l 16:23 -nolsqfit -radius 2 -dir ens model
All of the structures should already be oriented in the same way, so we do not need to superimpose
them before comparing them (hence the -nolsqfit option).
Check the clusters that have been generated with the following command:
showcluster.pl -dir ens model
Now we can use bestcluster.pl to find the best cluster (according to the average Modeller score):
bestcluster.pl -dir ens -prop score model
4. All-atom scoring
In addition to the Modeller score we will also calculate CHARMM implicit solvent free energy estimates. This
requires that we minimize the structures briefly. The minimization of all the structures in the ensemble is
accomplished with:
ensmin.pl -l 1rnu.pdb 16:23 -par dielec=rdie,epsilon=4 -dir ens model min
This command minimizes only the loop conformations and keeps the rest of the structure as in the
input structure 1rnu.pdb. We use a distance-dependent dielectric here to minimize the computational cost,
but even with this option the step will take a few minutes to complete. If you have multiple CPUs/cores, you
can use the option -cpus N to parallelize the calculation. The result is a new set of structures
with the new tag min.
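If you are unsure what value to pass for N, the number of available cores can be queried from the shell. This is a small sketch using the standard getconf utility. The _NPROCESSORS_ONLN variable is supported on Linux and macOS but is not guaranteed everywhere, so the snippet falls back to 1. The variable name ncpu is arbitrary.

```shell
# Number of online processors; fall back to 1 if the query is unsupported.
ncpu=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
echo "cores available for -cpus: $ncpu"
```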
Next, we will calculate the energies for the minimized structures with:
enseval.pl -par gb,nocut -set mmgbsa=total -dir ens min
The all-atom energies are now available under the name mmgbsa for the min tag. In order to compare them
with the Modeller scores and use the clustering information for the original structures (under the
model tag), we can transfer the data with the following command:
getprop.pl -dir ens -prop mmgbsa min | setprop.pl -dir ens -f - -inx 2 model mmgbsa
Now we can try the
bestcluster.pl command again, but this time using the all-atom score:
bestcluster.pl -dir ens -prop mmgbsa model
Are the clusters ranked differently?
5. Comparison with experimental structure
A newer structure of ribonuclease A (7RSA) actually contains coordinates for residues 16 through 23.
Copy the PDB file for 7RSA to the current directory as 7RSA.pdb and measure how much the predicted structures deviate from the experimental structure:
ensrun.pl -set rmsca:1 -dir ens model rms.pl -fitxl -l 16:23 -out CA `pwd`/7RSA.pdb
The previous command calculates the coordinate root mean square deviation (RMSD) for residues 16 through 23 after superimposing
the rest of the structure. The result is stored under the name rmsca.
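For reference, the quantity being computed can be written as follows, assuming the standard definition of coordinate RMSD over the selected atoms. The -out CA option suggests that these are the C-alpha atoms of residues 16 through 23. The symbols are introduced here for illustration only.

```latex
\mathrm{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N}
  \left\lVert \mathbf{r}_i^{\mathrm{model}} - \mathbf{r}_i^{\mathrm{exp}} \right\rVert^{2}}
```

Here N is the number of atoms compared, and the two position vectors are the coordinates of atom i in the model and in 7RSA after the superposition described above.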
We can now use
bestcluster.pl again to find out whether the best-scoring cluster also corresponds to the cluster with
the lowest RMSD:
bestcluster.pl -dir ens -prop rmsca model
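Find the models with the lowest RMSD and with the best scores and compare them visually using VMD. You can use
ensfiles.pl as in the following commands to list the structures in a cluster sorted by each property
(here for the cluster tagged t.3; substitute the cluster of interest):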
ensfiles.pl -cluster t.3 -sort rmsca -dir ens model
ensfiles.pl -cluster t.3 -sort score -dir ens model
ensfiles.pl -cluster t.3 -sort mmgbsa -dir ens model