Generation and MD Simulation of RNA

Objective and Overview

This tutorial gives an overview of the CHARMM command required to generate and simulate and RNA duplex (PDB Identifer: 1RRR) in explicit solvent. In addition, one example input of the conversion of RNA, which is the default nucleic acid in CHARMM, to DNA is given. Note that all remaining steps of the solvation, minimization and simulation processes are the same in DNA and RNA. An image of the duplex is shown below.

rr_gen.gif
watsodbox.jpg

Topology and parameter files used for the calculation are for the additive all-atom CHARMM force field top_all27_na.rtf and par_all27_na.prm. Initially, the PDB file is converted to a CHARMM format coordinate file (CRD) as described in the subdirectory a convpdb2crd (see 00readme). (Note to stop the fortran program, after its says Disulphide bonds, use .)

Once the CRD file is created it is generated as shown in the file, gen_rna_duplex.inp, from which the PSF and corresponding CRD files are obtained. In the case of DNA, the generation steps are the same with the additional step of removing the 2'OH moieties as shown in gen_dna_duplex.inp.

One can also use the MMTSB Tool convpdb.pl to perform this initial task. The command:

convpdb.pl -segnames -model 1 -out param22 1RRR.pdb > rr_1.pdb

produces a suitable pdb file with the needed translations. Small changes to the files gen_rna_duplex.inp and gen_dna_duplex.inp will allow one to read the pdb file from convpdb.pl. The needed changes are:

In original file replace:

open unit 20 read form name rr_1_noh.pdb.crd
read coor card unit 20

with:

rename segid n01a select segid rna1 end
rename segid n01b select segid rna2 end
open unit 20 read form name rr_1.pdb
read coor pdb unit 20 resid
rename segid rna1 n01a select segid n01a end
rename segid rna2 select segid n01b end

This renames the segment names for each chain to match those in the pdb file and then changes them back so the segments have the same names in the later scripts.

Once the duplex is generated waterbox to overlay the duplex is generated, . The box dimensions are determined by the dimension of the nucleic acid on which the calculation is being performed. Typically, the distance between the nucleic acid and the edge of the solvation cell is to be around 8-9 Å. In the present example script, watsodbox.inp, a water box with sodium ions (to neutralize the RNA) is created, with the resulting box shown below.

Once the waterbox with sodium is available, the script, overlay_rna.inp, is used to put the RNA (or DNA) in the waterbox. The script involves placing the waterbox over the duplex, deleting all waters and sodiums that are in contact with the RNA and then checking to insure that the system is electronically neutral. If the system is not neutral sodiums are added at random positions or deleted to obtain a neutral system. Note that the present approach yields a system with only minimal salt. If a higher salt concentration is required it is suggested that a, for example, waterbox containing NaCl, be created and equilibrated via an MD simulations with the resulting "salt-water box" used for setting up the system. An image of the overlaid system is shown below.

sol.png

The remainder of the CHARMM scripts involve equilibration of the system around the RNA followed by the production MD simulation. During equilbration, equil.inp, the coordinates of the DNA/RNA are harmonically constrained and a minimization is performed. This is followed by a NVT simulation (minimum of 20 ps) to allow the solvent and ions to equilibrate around the nucleic acid. The final coordinates from the equilibration are used for the production simulation, dyn_npt_1.inp. The production simulation involves an NPT simulation with long-range electrostatics treated via the Particle-Mesh Ewald method. Note that the trajectory is run in a loop such that a series of trajectory (.trj) files along with the associated restart and energy files are created. This allows the simulation to be readily restarted and facilitates handling of the files during analysis.

It is probably not feasible to run the production on your workstation in a short time. If you wish to try, you will need to set the nstep variable in the input file dyn_npt_1.inp.

Two simple analysis CHARMM scripts to determine the RMS Differences as a function of time (rms.inp) and the base pairing energy as a function of time (bp_interactions.inp) are included in the analysis subdirectory.

Written by A.D. Mackerell, Jr.