Generation and MD Simulation of RNA

Objective and Overview

This tutorial gives an overview of the CHARMM command required to generate and simulate and RNA duplex (PDB Identifer: 1RRR) in explicit solvent. In addition, one example input of the conversion of RNA, which is the default nucleic acid in CHARMM, to DNA is given. Note that all remaining steps of the solvation, minimization and simulation processes are the same in DNA and RNA. An image of the duplex is shown below.

rr_gen.gif
watsodbox.jpg

Topology and parameter files used for the calculation are for the additive all-atom CHARMM force field top_all27_na.rtf and par_all27_na.prm. You can download them and put them in your $CHARMMDATA directory.

One can use the MMTSB ToolSet to prepare the PDBfiles using the command:

convpdb.pl -segnames -model 1 -out param22 1RRR.pdb > rr_1.pdb

Alternatively, the PDB file can be converted to a CHARMM format coordinate file (CRD) as described in the subdirectory a convpdb2crd (see 00readme). (Note to stop the fortran program, after its says Disulphide bonds, use .)

Once the CRD file is created it is generated as shown in the file, gen_rna_duplex_fortran.inp, from which the PSF and corresponding CRD files are obtained. In the case of DNA, the generation steps are the same with the additional step of removing the 2'OH moieties as shown in gen_dna_duplex_fortran.inp.

Small changes to the files gen_rna_duplex_fortran.inp and gen_dna_duplex_fortran.inp allows one to read the pdb file from convpdb.pl. These changes are in gen_rna_duplex.inp.

gen_rna_duplex_fortran.inp
open unit 20 read form name rr_1_noh.pdb.crd
read coor card unit 20
gen_rna_duplex.inp
rename segid n01a select segid rna1 end
rename segid n01b select segid rna2 end
open unit 20 read form name rr_1.pdb
read coor pdb unit 20 resid
rename segid rna1 n01a select segid n01a end
rename segid rna2 select segid n01b end

gen_rna_duplex.inp renames the segment names for each chain to match those in the pdb file and then changes them back so the segments have the same names in the later scripts. First, create a directory named "output" and then execute the gen_rna_duplex.inp script:

mkdir output
$CHARMMEXEC < gen_rna_duplex.inp > gen_rna_duplex.log & 

Once the duplex is generated, a waterbox which overlays the duplex is constructed. The waterbox dimensions are determined by the dimension of the nucleic acid on which the calculation is being performed. Typically, the distance between the nucleic acid and the edge of the solvation cell is to be around 8-9 Å. In the present example script, watsodbox.inp, a water box with sodium ions (to neutralize the RNA) is created, with the resulting box shown below.

$CHARMMEXEC < watsodbox.inp > watsodbox.out 
(note: ensure wat_sod_1b.crd is in working directory)

Once the waterbox with sodium is available, the script, overlay_rna.inp, is used to put the RNA (or DNA) in the waterbox. The script involves placing the waterbox over the duplex, deleting all waters and sodiums that are in contact with the RNA and then checking to insure that the system is electronically neutral. If the system is not neutral sodiums are added at random positions or deleted to obtain a neutral system. Note that the present approach yields a system with only minimal salt. If a higher salt concentration is required it is suggested that a, for example, waterbox containing NaCl, be created and equilibrated via an MD simulations with the resulting "salt-water box" used for setting up the system. An image of the overlaid system is shown below.

$CHARMMEXEC < overlay_rna.inp > overlay_rna.out
sol.png

The remainder of the CHARMM scripts involve equilibration of the system around the RNA followed by the production MD simulation.

$CHARMMEXEC < equil.inp > equil.out & 

During equilbration, equil.inp, the coordinates of the DNA/RNA are harmonically constrained and a minimization is performed. This is followed by a NVT simulation (minimum of 20 ps) to allow the solvent and ions to equilibrate around the nucleic acid. The final coordinates from the equilibration are used for the production simulation, dyn_npt_1.inp.

$CHARMMEXEC nstep=5000 < dyn_npt_1.inp > dyn_npt_1.out & 

The production simulation involves an NPT simulation with long-range electrostatics treated via the Particle-Mesh Ewald method. Note that the trajectory is run in a loop such that a series of trajectory (.trj) files along with the associated restart and energy files are created. This allows the simulation to be readily restarted and facilitates handling of the files during analysis.

It is probably not feasible to run the production on your workstation in a short time. If you wish to try, you will need to reduce the nstep variable on the command line.

Two simple analysis CHARMM scripts to determine the RMS Differences as a function of time (rms.inp) and the base pairing energy as a function of time (bp_interactions.inp) are included in the analysis subdirectory.

Then, run the input files:

cd analysis
cp ../rms.inp .
cp ../bp_interactions.inp .
$CHARMMEXEC nstep=5000 < rms.inp > rms.out
$CHARMMEXEC nstep=5000 < bp_interactions.inp > bp_interactions.out
Written by A.D. Mackerell, Jr.