Instructions for using the protein hydrogen-bonding database potential (HBDB) in structure refinement

Alexander Grishaev


March 14, 2008



The goal of refinement against HBDB is to drive the atomic fragments linked by backbone-backbone hydrogen bonds in proteins toward relative geometries that match those observed in a database of high-quality, high-resolution X-ray structures. Based on NMR cross-validation, this procedure generally improves structural accuracy. HBDB refinement can be done with both CNS and Xplor-NIH; input scripts for both are included. The refinement can be run in two modes: with the program automatically finding the HN/O pairs linked by backbone-backbone H-bonds (automated mode), or with the program adhering to a user-specified list of H-bonds to be enforced (manual mode). Both modes are described in the example scripts provided. The only adjustable parameters that need to be optimized are the force constants for the two potential terms used by HBDB. They are adjusted so that the final energies are compatible with those of the X-ray database. The scripts and programs used for structure refinement with HBDB are included in the archive.


Software installation

Most recent releases of Xplor-NIH include HBDB, so no HBDB-specific software installation is necessary. For CNS, the distribution must be recompiled with the HBDB-related files. To do this, place hbdb.f and into the CNS source directory and copy the HBDB-related code into your cns.f , energy.f and (these files are also included so that you can see where the code goes). Then add hbdb.obj to the Makefile in the same way as the other entries in that file. Finally, recompile: I run the command “nmake” under Windows from the command line within the source directory. The compilation should produce an executable (CNS_solve.exe under Windows), which you can copy to your working directory. You can also send me an email if you need a pre-compiled CNS executable that includes the HBDB code and runs under Windows.


Preparation of the files necessary for the refinement

The directory used for the refinement should contain a file (hbdb_files.dat is included as an example) that points to the location of the files containing the potentials used by HBDB: the 11 hbpot_xyz*.dat files and the 4 era*.dat files. The input scripts should contain references to the HBDB code; see the sa_hbdb*.inp files provided as examples with this distribution. At a minimum, HBDB needs to be initialized and included in the list of active energy terms. The statistics of the final structures can be printed in the file header with the “hbdb print end” statement, as shown in the example input scripts. When a fixed list of H-bonds is used as input, the corresponding file needs to be prepared. The included examples, hbdb_list*.dat, have the correct syntax, which needs to be followed exactly.
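Before starting a run, it can be useful to confirm that all potential files named above are present. The following is a minimal Python sketch, not part of the HBDB distribution; the function names are hypothetical, and it assumes only the two file patterns and counts stated in the text (11 hbpot_xyz*.dat and 4 era*.dat files) and that you pass it the directory your hbdb_files.dat points to.

```python
from pathlib import Path

# Patterns and expected counts taken from the text above.
EXPECTED_COUNTS = {"hbpot_xyz*.dat": 11, "era*.dat": 4}


def count_pmf_files(directory):
    """Return {pattern: number of matching files} for the given directory."""
    d = Path(directory)
    return {pattern: len(list(d.glob(pattern))) for pattern in EXPECTED_COUNTS}


def incomplete_patterns(directory):
    """Return the patterns whose file count differs from the expected count."""
    found = count_pmf_files(directory)
    return [p for p, n in EXPECTED_COUNTS.items() if found[p] != n]
```

An empty list from incomplete_patterns() means both sets of files were found in full; it says nothing about whether the file contents are valid.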


Structure refinement with the HBDB term

The refinement is best done in several cycles in order to optimize the force constant settings and, when the manual mode is used, the contents of the input H-bond list. Violations of the input distance restraints should be monitored when HBDB is applied; persistent violations may be due to misassignments or incorrect restraint bounds. The automated mode for detecting the H-bonds to be enforced is the easiest to set up, as it does not require a user-specified list of H-bonds and is, in some sense, the most objective way to detect them. Its downside is that it may in some cases over-detect or under-detect H-bonds, since in this mode both the detection of an H-bond and the selection of the potential type are based on the HN/O proximity and on the pattern of surrounding detected H-bonds at each instant in the refinement. Both of these are affected by the structural accuracy, so mistakes can be made.
A recommended strategy is to run the refinement in the automated mode first and then switch to the manual mode, using those H-bonds detected by the automated mode that are refinable. In practice, “refinable” means that, at the end of the automated mode refinement, the site-specific energies of the H-bonds selected for the manual mode should be negative for the three-dimensional potential and below 4.0 for the two-dimensional potential. The example scripts included in the archive have the “hbdb print end” statement, which prints the HBDB statistics (the list of active H-bonds and their energies) into the header of the output PDB file. An inspection of this information should make it clear which H-bonds are within the database distributions and which are not. The automated mode occasionally assigns erroneous i/i+2 H-bonds; before enforcing such H-bonds, one should carefully evaluate the Ramachandran statistics in their vicinity.
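The selection criterion described above (3D energy negative, 2D energy below 4.0) can be sketched in Python as follows. This is an illustration only: the HBond record and the function name are hypothetical, and the sketch assumes that the per-site energies have already been read by hand or by your own parser from the header written by “hbdb print end”, whose exact layout is not reproduced here.

```python
from typing import NamedTuple


class HBond(NamedTuple):
    """One detected H-bond with its site-specific HBDB energies (hypothetical record)."""
    donor_resid: int     # residue number of the HN donor
    acceptor_resid: int  # residue number of the O acceptor
    e3d: float           # three-dimensional (directional) potential energy
    e2d: float           # two-dimensional (linearity) potential energy


E3D_LIMIT = 0.0  # 3D energy must be negative
E2D_LIMIT = 4.0  # 2D energy must be below 4.0


def select_for_manual_mode(hbonds):
    """Keep only the H-bonds that satisfy both energy criteria."""
    return [hb for hb in hbonds if hb.e3d < E3D_LIMIT and hb.e2d < E2D_LIMIT]
```

H-bonds that fail either test are simply left out of the list; any i/i+2 pairs that pass still deserve the manual check of the Ramachandran statistics mentioned above.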
Once the set of refined hydrogen bonds becomes stable from iteration to iteration in the automated mode, or the list of refined H-bonds stops changing in the manual mode, the force constants for HBDB can be optimized. The goal of the optimization is to bring the average value of the 3D (directional) potential, as reported by the HBDB module, to below ~ -4.4, and the average value of the 2D (O…H-N linearity) potential to below ~0.7. Any values below -4.5 for the 3D potential and below 0.6 for the 2D potential are equally good. The force constants therefore need to be adjusted up or down to reach these ranges. A typical range for the 3D force constant is 0.2-1.0. Typical ratios between the 3D and 2D force constants are between 2:1 and 4:1. An example of a script containing HBDB terms in the automated mode is listed below. The HBDB-related parts are the block of HBDB parameters (kdir through @hbdb_files.dat), the hbdb entry in the flags statement, and the “hbdb print end” statement.
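The bookkeeping for this adjustment can be sketched in Python. This is only an illustration under stated assumptions: the function names and the multiplicative step of 1.25 are hypothetical choices, not values prescribed by HBDB, and the sketch only covers raising a force constant when the corresponding average energy has not yet reached its target. When and by how much to lower a force constant is left to the user's judgment.

```python
TARGET_3D = -4.4  # average 3D energy should fall below this value
TARGET_2D = 0.7   # average 2D energy should fall below this value


def suggest_force_constants(kdir, klin, mean_e3d, mean_e2d, step=1.25):
    """Return (kdir, klin) for the next cycle.

    A force constant is multiplied by `step` (an assumed value) only if
    its average energy is still above the target; otherwise it is kept.
    """
    new_kdir = kdir * step if mean_e3d > TARGET_3D else kdir
    new_klin = klin * step if mean_e2d > TARGET_2D else klin
    return new_kdir, new_klin


def ratio_is_typical(kdir, klin):
    """True if the 3D:2D force constant ratio lies between 2:1 and 4:1."""
    return 2.0 <= kdir / klin <= 4.0
```

For example, with kdir = 0.25 and klin = 0.08, average energies of -4.0 (3D) and 0.9 (2D) would lead the sketch to raise both constants, while averages of -4.6 and 0.5 would leave them unchanged.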


define( md.seed=823641; )
set seed=&md.seed end
set message=on echo=on end
parameter @prot.par @axis.par end
structure @1y8b.mtf @axis.mtf end
coor @msg_init_hbdb3.pdb
dele sele (not known) end
coor copy end 


 kdir = 0.25 ! directional (3d)     force const - final energy per HB should be ~-4.5 kT
 klin = 0.08 ! linearity (E(ohn|oh)) force const - final energy per HB should be ~ 0.5 kT
 nseg = 1    ! number of active segments - one chain
 nmin =   3  ! min res # to include in the HBDB
 nmax = 722  ! max res # to include in the HBDB
 ohcut   =  2.60 ! DO NOT MODIFY!
 coh1cut = 100.0 ! DO NOT MODIFY!
 coh2cut = 100.0 ! DO NOT MODIFY!
 ohncut  = 100.0 ! DO NOT MODIFY!
 updfrq = 1000   ! update frequency for the automated mode
 prnfrq = 1000   ! print frequency
 freemode  = 1   ! free mode flag       (0=off,1=on)
 fixedmode = 0   ! fixed list mode flag (0=off,1=on)
 mfdir  = 0      ! DO NOT MODIFY!
 mflin  = 0      ! DO NOT MODIFY!
 kmfd   = 0.0    ! DO NOT MODIFY!
 kmfl   = 0.0    ! DO NOT MODIFY!
 renf =  2.3     ! can be 2.2-2.3
 kenf = 30.0     ! can be 20-50
 @hbdb_files.dat ! location of the PMF files


flags exclude * include bond angle impr vdw noe cdih coll dcsa sani hbdb end
igroup interaction (all) (all) weights * 1 end end


 evaluate ($filename="msg_hbdb_"+encode($ifile)+".pdb")
 set print=$filename end
 hbdb print end
write coordinates output =$filename end
 evaluate ($ifile = $ifile +1)

The following statement would be used to invoke HBDB in the manual list mode.

 kdir = 0.75     ! force constant for the directional (3D) potential
 klin = 0.25     ! force constant for the linearity (2D) potential
 nseg =   2      ! number of segments active in hbdb
 nmin =  20      ! min resid for segment 1
 nmax =  92      ! max resid for segment 1
 segm =   A      ! chain ID for segment  1
 nmin =  20      ! min resid for segment 2
 nmax =  92      ! max resid for segment 2
 segm =   B      ! chain ID for segment  2
 ohcut   =  2.60 ! DO NOT MODIFY!
 coh1cut = 100.0 ! DO NOT MODIFY!
 coh2cut = 100.0 ! DO NOT MODIFY!
 ohncut  = 100.0 ! DO NOT MODIFY!
 updfrq =   1000 ! update frequency for the automated mode
 prnfrq =   1000 ! print frequency
 freemode  = 0   ! automated mode flag (0=off, 1=on)
 fixedmode = 1   ! fixed     mode flag (0=off, 1=on)
 mfdir  = 0      ! DO NOT MODIFY!
 mflin  = 0      ! DO NOT MODIFY!
 kmfd   =  0.0   ! DO NOT MODIFY!
 kmfl   =  0.0   ! DO NOT MODIFY!
 renf =    2.2   ! can be 2.2-2.3
 kenf =   20.0   ! can be 20-50
 @hbdb_files.dat ! file listing the location of the PMF files
 @hbdb0731.tbl   ! HBDB restraint list


Here, the HBDB restraints are in the following format:

assign (don and resid 35 and segid A and name HN )
          (acc and resid 42 and segid A and  name O )
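If the list of H-bonds is assembled elsewhere, entries in the format shown above can be written out programmatically. The Python sketch below is a hypothetical helper, not part of the distribution; it reproduces the layout of the example entry, and it assumes that every restraint pairs a backbone HN donor with a backbone O acceptor, as in that example.

```python
def format_hbdb_restraint(donor_resid, donor_segid, acceptor_resid, acceptor_segid):
    """Return one two-line restraint entry in the layout of the example above."""
    donor = f"assign (don and resid {donor_resid} and segid {donor_segid} and name HN )"
    acceptor = (
        f"          (acc and resid {acceptor_resid} and segid {acceptor_segid} and  name O )"
    )
    return donor + "\n" + acceptor


def format_restraint_table(pairs):
    """Join entries for an iterable of (donor_resid, donor_segid, acceptor_resid, acceptor_segid)."""
    return "\n".join(format_hbdb_restraint(*p) for p in pairs) + "\n"
```

Calling format_hbdb_restraint(35, "A", 42, "A") yields the entry shown above; the result of format_restraint_table() can be saved to the restraint file that the script reads.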


Last updated: November 2009