MMTSB Tool Set - Ensemble computing

Objective and Overview

This tutorial explores the ensemble computing capabilities of the MMTSB Tool Set. The example system is the WW domain from Yes-associated protein (Yap), for which a structure has been reported in complex with a bound proline-rich peptide. We use this complex to estimate the binding energy from molecular dynamics sampling combined with MM/GB energy estimates.

This tutorial illustrates how to use the MMTSB Tool Set to run an MD simulation of the protein-ligand complex, create an ensemble directory structure from the resulting trajectory, and then use the ensemble analysis tools in the MMTSB Tool Set to calculate an approximate binding energy.

Binding energies will be estimated following the MMPB/SA or MMGB/SA scheme. From the conformations sampled during a molecular dynamics simulation of the complex, average energies are calculated separately for the complex, the receptor, and the substrate (the ligand). The binding energy can then be estimated from the difference:

ΔΔG(binding) = ΔG(complex) - ΔG(receptor) - ΔG(ligand)
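
As a hedged illustration of this difference, the short Python sketch below averages per-snapshot energies for each species and subtracts the averages. The function name `binding_energy` and all energy values are hypothetical placeholders chosen for illustration; they are not output from the MMTSB Tool Set.

```python
from statistics import mean


def binding_energy(e_complex, e_receptor, e_substrate):
    """Estimate the binding energy as <E_complex> - <E_receptor> - <E_substrate>."""
    return mean(e_complex) - mean(e_receptor) - mean(e_substrate)


# Hypothetical energies in kcal/mol for three snapshots (illustrative only)
e_complex = [-1250.0, -1245.0, -1255.0]
e_receptor = [-1000.0, -998.0, -1002.0]
e_substrate = [-220.0, -219.0, -221.0]

print(binding_energy(e_complex, e_receptor, e_substrate))
```

Because all three energies are evaluated on the same snapshots, subtracting the averages gives the same number as averaging the per-snapshot differences.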

1. Initial system setup

Obtain a copy of the experimental structure of the YAP-WW-domain from the Protein Data Bank. The PDB code is 1JMQ.

Because the structure was solved by NMR spectroscopy there are multiple models in the PDB file. Extract the first model with:

    -model 1 1JMQ.pdb > yapww.pdb

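To make clear what selecting the first model involves, here is a minimal Python sketch, independent of the MMTSB Tool Set, that keeps the coordinate records up to the first ENDMDL record of a multi-model PDB file. The function name `first_model` and the sample records are assumptions made for illustration only.

```python
def first_model(pdb_lines):
    """Return ATOM/HETATM records that precede the first ENDMDL record."""
    kept = []
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # the first model ends here; ignore all later models
        if line.startswith(("ATOM", "HETATM")):
            kept.append(line)
    return kept


# Two-model toy input (hypothetical coordinates, illustrative only)
sample = [
    "MODEL        1",
    "ATOM      1  N   GLY A   1      11.104   6.134  -6.504",
    "ATOM      2  CA  GLY A   1      11.639   6.071  -5.147",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  N   GLY A   1      10.900   6.300  -6.400",
    "ENDMDL",
]

for record in first_model(sample):
    print(record)
```

A file without MODEL/ENDMDL records passes through unchanged apart from the removal of non-coordinate lines.
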
Next, we will minimize the experimental structure with a distance-dependent dielectric as a fast approximation of the solvent environment:

    -par dielec=rdie,epsilon=4,minsteps=50 \
        -log min.log yapww.pdb > yapww.min.pdb

We are now ready to run a molecular dynamics simulation of the complex to obtain conformational samples for the MMGB/SA analysis. Because no explicit solvent is present, the simulation will be run with GB (GBMV). Also, this simulation is very short for this type of analysis, but it will suffice for the purposes of this tutorial.

    -par dynsteps=2000,dynoutfrq=100 -trajout yapww.dcd \
        -log dynamics.log -final yapww.min.pdb

For the subsequent analysis we will take advantage of the ensemble computing facility in the MMTSB Tool Set. In order to use the ensemble computing tools we have to first generate an ensemble directory structure from the trajectory file. This can be done with the following command:

    -ensdir ens -ens complex yapww.dcd

In this case the ensemble data is stored in a subdirectory 'ens' and the conformations from the trajectory are available through the 'complex' tag.

Next, we will extract the substrate (chain P) and receptor (chain A) for each complex into separate files for later analysis:

    -dir ens -new substrate complex -chain P
    -dir ens -new receptor complex -chain A

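The chain selection used in this step relies on the chain identifier stored in column 22 of each PDB coordinate record. The following Python sketch shows the same idea outside the MMTSB Tool Set; the function name `select_chain` and the sample records are hypothetical.

```python
def select_chain(pdb_lines, chain_id):
    """Keep ATOM/HETATM records whose chain identifier (column 22) matches."""
    return [
        line
        for line in pdb_lines
        if line.startswith(("ATOM", "HETATM"))
        and len(line) > 21
        and line[21] == chain_id  # column 22 is index 21 in Python
    ]


# Toy complex with a receptor atom (chain A) and a substrate atom (chain P)
complex_lines = [
    "ATOM      1  N   GLY A   1      11.104   6.134  -6.504",
    "ATOM      2  N   PRO P   1       2.310   4.870   1.250",
]

receptor = select_chain(complex_lines, "A")
substrate = select_chain(complex_lines, "P")
print(len(receptor), len(substrate))
```
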
We now have three files for each snapshot, complex.pdb, receptor.pdb, and substrate.pdb. You should find these files in any of the ensemble subdirectories. You will see, however, that the files are automatically compressed to preserve space.
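
If you want to inspect a compressed snapshot from your own scripts, a small helper along the following lines may be convenient. This sketch assumes the files are gzip-compressed and carry a `.gz` suffix, which you should confirm by listing an ensemble subdirectory; the function name `read_snapshot` is hypothetical.

```python
import gzip


def read_snapshot(path):
    """Read a PDB file as text lines, handling a gzip-compressed copy.

    Assumption: compressed files end in '.gz' (check your ensemble directory).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read().splitlines()
```

Calling `read_snapshot` on either a plain or a gzip-compressed file returns the same list of lines.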

We are now ready to evaluate the energies that we need for the binding free energy estimate. First, we calculate the total energy for the complex:

    -dir ens -set dgcomplex:1 complex \
        -par gb,nocut complex.pdb

Second, we will estimate the energies for the substrate and receptor alone:

    -dir ens -set dgreceptor:1 complex \
        -par gb,nocut receptor.pdb

    -dir ens -set dgsubstrate:1 complex \
        -par gb,nocut substrate.pdb

We can take a look at the results with:

    -dir ens -prop dgcomplex,dgreceptor,dgsubstrate complex

or combine the results to get the binding free energy for each snapshot:

    -dir ens -prop dgcomplex-dgreceptor-dgsubstrate complex

or obtain the average value:

    -dir ens -score avg \
        -prop dgcomplex-dgreceptor-dgsubstrate complex

The result is likely around -30 kcal/mol. That number seems fairly large, but in our analysis so far we have neglected entropic effects related to the substrate and receptor (the entropic effects of the solvent are included in the implicit solvent model).
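
The per-snapshot combination and averaging described above can be sketched in Python as follows. The sketch also reports a naive standard error, which is an addition not performed in this tutorial and which assumes statistically independent snapshots (consecutive MD frames usually are not, so the true uncertainty is likely larger). The function name `summarize` and all energy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(dg_complex, dg_receptor, dg_substrate):
    """Return per-snapshot binding energies, their mean, and a naive standard error."""
    per_snapshot = [
        c - r - s for c, r, s in zip(dg_complex, dg_receptor, dg_substrate)
    ]
    average = mean(per_snapshot)
    # Naive estimate: treats snapshots as independent samples (an assumption)
    std_error = stdev(per_snapshot) / sqrt(len(per_snapshot))
    return per_snapshot, average, std_error


# Hypothetical energies in kcal/mol for three snapshots (illustrative only)
dg_complex = [-1250.0, -1245.0, -1255.0]
dg_receptor = [-1000.0, -998.0, -1002.0]
dg_substrate = [-220.0, -219.0, -221.0]

values, average, std_error = summarize(dg_complex, dg_receptor, dg_substrate)
print(values, average, round(std_error, 2))
```

With only a handful of snapshots from a very short trajectory, the spread of the per-snapshot values is a useful reminder of how noisy the average can be.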

Written by M. Feig