GROMACS #02 – a protein with a special residue in a membrane inside a water box

1- Force field preparation [needs to be checked for naming consistency]

Prepare a force field including the protein part and the lipid part. In this case we’ll be using Gromos54a7 and the Berger lipid definitions for DPPC. This part is largely based on Lemkul’s tutorial.

The lipid bilayer we will be simulating is DPPC (dipalmitoyl phosphatidylcholine), so we need parameters for it as well. But if pdb2gmx is designed to handle only proteins, nucleic acids, and a finite amount of cofactors, where do we get parameters for this molecule, and how do we apply them to our system?

There is no universal force field for simulating all proteins, nucleic acids, lipids, carbohydrates, and arbitrary small molecules. Since we are simulating a system that contains both protein and lipids, how do we solve the force field problem? Under GROMACS, the most widely-used parameters for the lipid component of membrane-protein simulations are commonly called “Berger lipids,” derived by Berger, Edholm, and Jähnig (ref). These parameters can be combined with a GROMOS representation of the protein.

The so-called “Berger lipids” are somewhat of a hybrid between GROMOS atomtypes and OPLS partial charges. Since the long alkane chains are poorly represented by GROMOS bonded parameters, a Ryckaert-Bellemans dihedral potential is used, and a scaling factor of 0.125 is applied to Lennard-Jones 1-4 interactions. These lipid parameters are distributed by D. Peter Tieleman, through his website. While you are there, download the following files:

dppc128.pdb – the structure of a 128-lipid DPPC bilayer
dppc.itp – the moleculetype definition for DPPC
lipid.itp – Berger lipid parameters

So what exactly is lipid.itp, and how do you use it? Think of this analogy: gromos53a6.ff/forcefield.itp is to your protein what lipid.itp is to the lipids. Essentially, lipid.itp contains all the atom types, nonbonded parameters, and bonded parameters for a large class of lipids, much like how forcefield.itp, ffnonbonded.itp, and ffbonded.itp function in relation to a protein. That said, we cannot simply #include “lipid.itp” within our topology, since it is at the same level (precedence) as forcefield.itp.
[…] Next, copy and paste the entries in the [ atomtypes ], [ nonbond_params ], and [ pairtypes ] sections from lipid.itp into the appropriate headings within ffnonbonded.itp. You will find that the lipid.itp [ atomtypes ] section lacks atomic numbers (the at.num column), so add these in. The newly-modified lines should be:

   LO    8    15.9994      0.000     A  2.36400e-03 1.59000e-06 ;carbonyl O, OPLS
  LOM    8    15.9994      0.000     A  2.36400e-03 1.59000e-06 ;carboxyl O, OPLS
  LNL    7    14.0067      0.000     A  3.35300e-03 3.95100e-06 ;Nitrogen, OPLS
   LC    6    12.0110      0.000     A  4.88800e-03 1.35900e-05 ;Carbonyl C, OPLS
  LH1    6    13.0190      0.000     A  4.03100e-03 1.21400e-05 ;CH1, OPLS
  LH2    6    14.0270      0.000     A  7.00200e-03 2.48300e-05 ;CH2, OPLS
   LP   15    30.9738      0.000     A  9.16000e-03 2.50700e-05 ;phosphor, OPLS
  LOS    8    15.9994      0.000     A  2.56300e-03 1.86800e-06 ;ester oxygen, OPLS
  LP2    6    14.0270      0.000     A  5.87400e-03 2.26500e-05 ;RB CH2, Bergers LJ
  LP3    6    15.0350      0.000     A  8.77700e-03 3.38500e-05 ;RB CH3, Bergers LJ
  LC3    6    15.0350      0.000     A  9.35700e-03 3.60900e-05 ;CH3, OPLS
  LC2    6    14.0270      0.000     A  5.94700e-03 1.79000e-05 ;CH2, OPLS

In the [ nonbond_params ] section, you will find the line “;; parameters for lipid-GROMOS interactions.” Delete this line and all of the subsequent lines in this section. These nonbonded combinations are specific to the deprecated ffgmx force field. Removing these lines allows the interactions between the protein and the lipids to be generated by the standard combination rules of GROMOS96 53A6. Non-bonded interactions involving atom type HW are also present; since these are all zero you can delete these lines as well, or otherwise rename HW as H to be consistent with the GROMOS96 53A6 naming convention. If you do not rename these lines or remove them, grompp will later fail with a fatal error.
Append the contents of the [ dihedraltypes ] to the corresponding section of ffbonded.itp. Do not be concerned that these lines look a bit different. They are Ryckaert-Bellemans dihedrals, which differ in form from the standard periodic dihedrals used in the default GROMOS96 53A6 force field. Finally, change the #include statement in your topol.top from:

#include "gromos53a6.ff/forcefield.itp"
to:
#include "gromos53a6_lipid.ff/forcefield.itp"
Lastly, we need to include the specific parameters for our DPPC molecules. The process for doing so is quite simple. Just add the line #include “dppc.itp” in your topol.top, somewhere after the position restraints section for the protein, which defines the end of the protein [ moleculetype ] definition. Doing so is analogous to adding any other small molecule or solvent into the topology. In this section and throughout this tutorial, text in green will denote lines that you should add, while other text (in black) are lines that should already be present in the topology prior to the modification indicated.

; Include Position restraint file
#ifdef POSRES
#include "posre.itp"
#endif
; Include DPPC chain topology
#include "dppc.itp"
; Include water topology
#include "gromos53a6_lipid.ff/spc.itp"

[…] Regardless of which setup you choose, you must demonstrate that your model is reasonable. The particular parameter combinations described here have been shown to work in-house and according to reports by other users. It is up to you to convince your audience (i.e., reviewers) that you know what you are doing and that your model is valid.
If you have correctly followed all of the steps above, you will have a fully-functional force field that can be used to process other membrane proteins with pdb2gmx. Doing so removes the need to manually hack topol.top after the fact. Placing the new gromos53a6_lipid.ff directory in $GMXLIB will allow you to use this force field system-wide.

2- Align the protein with the z axis

gmx editconf -f protein_complete.pdb -o protein_align.pdb -princ -rotate 0 90 0
> Select group for the determining the orientation
> $0: System

#1: princ aligns the protein with the x asis (5.1.2) and rotate rotates the protein 90 degress around the y axis

#2: sometimes it helps to pre-optimize the protein in a water box

3- Process the protein

gmx pdb2gmx -f protein_align.pdb -o protein_proc.pdb -ter -water spc -ff ze_gromos54a7_atb_berger_fcy  -p topology.top
#This force field contains the Berger lipids definitions and an extra protein residue, FCY, for S-farnesyl-cysteine (see below)
> $ Select start terminus type for MET-1: 1: NH2
> $ Select end terminus type for MET-173: 1: COOH

4- Modify the topology: add the DPPC topology

cp /usr/share/gromacs/top/ze_gromos54a7_atb_berger_fcy.ff/dppc.itp .
edit topology.top
; Include Position restraint file
#ifdef POSRES
#include "posre.itp"
#endif

; Include DPPC chain topology
#include "dppc.itp"

; Include water topology
#include "./gromos54a7_atb_2.ff/spc.itp"

 5- Orient the protein and the membrane in the same coordinate frame

[WIP]