From the Bax Group at the National Institutes of Health

MICS: Prediction of Protein Structural Motifs from NMR Chemical Shifts
As described in the paper:
    Identification of helix capping and beta-turn motifs from NMR chemical shifts
   Yang Shen and Ad Bax
   J. Biomol. NMR, 52, 211-232 (2012)

Contact:
   shenyang@niddk.nih.gov
   bax@nih.gov

Web:    http://spin.niddk.nih.gov/bax/software/MICS
Server:    http://spin.niddk.nih.gov/bax/nmrserver/mics
JRAMA+ Viewer:    http://spin.niddk.nih.gov/bax/software/MICS/JRAMA+/

DOWNLOAD

RedHat Linux (Fedora Core)/Mac/Win32 version (v 1.00, last updated Nov 23, 2011)

The download archive can be unpacked on Unix with:

   tar -zxvf mics.tar.Z

or with traditional Windows zip software.
The shell script "install.com" in the package can be used to set up the program.


Contents

What is MICS?
Reliability of MICS
Components of the MICS System
How to Use MICS
Chemical Shift Input Format Used by MICS

What is MICS?

MICS (Motif Identification from Chemical Shifts) is a hybrid system for the empirical prediction of distinct protein structural motifs, including N-terminal and C-terminal helix capping motifs and five types of beta-turns (I, II, I', II' and VIII), from a combination of six kinds of chemical shift assignments (HN, HA, CA, CB, CO, N) for a given residue sequence. These structural motifs are well known to play a key role in stabilizing protein structure and are likely to be important in the protein folding process. MICS was developed from a systematic study of the NMR chemical shift (and amino acid sequence) patterns observed for each type of structural motif in a database of proteins of known structure and known NMR chemical shifts. The chemical shifts, together with the PDB-extracted amino acid preferences of the helix capping and beta-turn motifs, are then used as input data for training an artificial neural network, which outputs the statistical probability of finding each motif at any given position in the protein.


Reliability of MICS

MICS prediction from NMR chemical shifts has been trained and validated using a recently constructed database. The trained neural networks contained in the MICS program yield Matthews correlation coefficients (MCC) of ca. 0.85-0.92 for the N-terminal and C-terminal helix capping motif predictions and ca. 0.67-0.83 for the five types of beta-turn predictions, far exceeding the values attainable by other bioinformatics sequence analysis methods (MCC of <0.4-0.5). The trained neural networks were further validated using a smaller database of 11 new proteins not present in the training database. The MICS output assigns each residue a normalized probability of participating in any of the specific motifs, or of being part of a regular element of secondary structure.
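For readers unfamiliar with the MCC, the following minimal Python sketch shows how the coefficient is computed from the counts of true/false positives and negatives for a single motif class. The function name and the example counts are hypothetical and are not taken from the MICS paper or program.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts for one motif class (illustration only).
print(round(matthews_corrcoef(tp=90, tn=900, fp=10, fn=10), 3))
```

An MCC of 1 corresponds to perfect prediction, 0 to a prediction no better than random, and -1 to total disagreement.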


Components of the MICS System

The MICS program is implemented in C++ and includes a graphical interface, called jRAMA+, for displaying the prediction results. jRAMA+ is implemented in Java.

The MICS system comprises two main scripts:

  1. MICS (script name: mics)
    Performs the secondary structure and structural motif classifications, then summarizes the predictions.
     
  2. jRAMA+ (script name: jrama+)
    Interactive display of the predictions (requires Java to be installed).
     

The mics script can be invoked with the -help command-line argument to list all available options. jRAMA+ can also be run from a web browser with a suitable Java plugin installed (see the JRAMA+ Viewer link above for details).

Other files of the MICS system include:

mics/mics
A master script to run the MICS prediction.

mics/jrama+
A master script to run the jRAMA+ viewer.

mics/demo
A directory with example chemical shift input data and scripts for a demo of MICS.

mics/tab/*level*.tab
The weighting factors and biases of the neural network used in the prediction process.

mics/tab/*.tab
Tables of random coil shifts and neighboring-residue adjustment values used in the prediction process (the same tables as used in TALOS).

mics/bin/MICS.*
The compiled MICS binary files for multiple platforms, such as Linux (MICS.linux), MacOS (MICS.mac), and WindowsXP (MICS.winxp).

mics/bin/rama.jar
The compiled jRAMA+ viewer for multiple platforms.

mics/com/bmrb2talos.com
An example utility script for converting files from BMRB format to the required TALOS format; it can be copied to a working directory and used as needed.


How to Use MICS

Use of MICS is much the same as for the TALOS/TALOS+ program:

  1. Create a directory for the prediction session; all subsequent commands will be executed from this directory.

  2. Prepare the input table of shift assignments (for example "myshifts.tab"), according to the format given below.

  3. Run MICS (mics) to perform the predictions. Most commonly, this will simply require a command such as:

    mics -in myshifts.tab

    During the prediction, MICS generates a single output file, "predMICS.tab", which stores the normalized probability of finding each motif at each position in the protein. An example (excerpt) of the MICS-predicted score file ("predMICS.tab") is provided below:

    VARS RESID RESNAME CS_CNT SS_CLASS Q_H Q_E Q_L Q_NCAP Q_CCAP Q_T1@2 Q_T2@2 Q_T1p@2 Q_T2p@2 Q_T8@2 S2
    FORMAT %4d %1s %2d %2d %1s %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f 
    
    2 Q 6 E 0.006 0.930 0.052 0.010 0.000 0.000 0.000 0.008 0.025 0.029 0.889
    3 I 6 E 0.006 0.815 0.173 0.005 0.000 0.000 0.010 0.005 0.000 0.015 0.900
    ...
    
    where RESID is the residue number, RESNAME is the one-letter amino acid code, and CS_CNT is the number of chemical shifts available for that residue. The Q_H, Q_E and Q_L columns are the MICS-predicted scores for a given residue to be in a helix, strand or coil, respectively. The Q_NCAP (Q_CCAP) column is the MICS-predicted score for a given residue to be the Ncap (Ccap) residue of an N-terminal (C-terminal) helix capping motif. The Q_T1@2, Q_T2@2, Q_T1p@2, Q_T2p@2 and Q_T8@2 columns are the MICS-predicted scores for a given residue to be the second residue of a type I, II, I', II' or VIII beta-turn, respectively. The S2 column is the chemical shift based RCI-S2 value (Berjanskii MV and Wishart DS, 2005, J. Am. Chem. Soc. 127: 14970-14971).

     

  4. Run the jRAMA+ (jrama+) script or the web-based jRAMA+ to inspect the predictions. The simplest jRAMA+ invocation from the local computer is:

    jrama+ -in predMICS_norm.tab

    jRAMA+ will read and display all MICS-predicted motifs and their scores. An example MICS display window is shown below:

    [Figure: MICS display window]

    The upper panel shows the predicted RCI-S2 value (Berjanskii MV and Wishart DS, 2005, J. Am. Chem. Soc. 127: 14970-14971). The second panel shows the predicted secondary structure (aqua, beta-sheet; red, helix) and the position of the Ncap (Ccap) residue in each MICS-predicted N-terminal (C-terminal) helix capping motif (yellow arrows). The third to seventh panels show the MICS-predicted type I, II, I', II' and VIII beta-turns, respectively (blue bars, with solid color for the two center residues and transparent color for the first and last residues). The heights of the bars and arrows correspond to the normalized probabilities assigned by MICS.
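For scripted post-processing, the score file described in step 3 can also be read without jRAMA+. The following minimal Python sketch assumes only what the excerpt shows: a VARS line naming the columns, followed by whitespace-separated data rows. The function names and the threshold parameter are hypothetical and are not part of MICS.

```python
SAMPLE = """\
VARS RESID RESNAME CS_CNT SS_CLASS Q_H Q_E Q_L Q_NCAP Q_CCAP Q_T1@2 Q_T2@2 Q_T1p@2 Q_T2p@2 Q_T8@2 S2
FORMAT %4d %1s %2d %2d %1s %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f %8.3f

2 Q 6 E 0.006 0.930 0.052 0.010 0.000 0.000 0.000 0.008 0.025 0.029 0.889
3 I 6 E 0.006 0.815 0.173 0.005 0.000 0.000 0.010 0.005 0.000 0.015 0.900
...
"""

SECONDARY = ("Q_H", "Q_E", "Q_L")

def _convert(token):
    """Return an int or float when the token is numeric, else the raw string."""
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token

def parse_mics_table(text):
    """Parse the table into a list of dicts keyed by the VARS column names."""
    names, rows = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] in ("REMARK", "DATA", "FORMAT"):
            continue
        if fields[0] == "VARS":
            names = fields[1:]
            continue
        if len(fields) != len(names):
            continue  # skip elided ("...") or malformed rows
        rows.append({name: _convert(tok) for name, tok in zip(names, fields)})
    return rows

def motif_hits(rows, threshold=0.5):
    """List (RESID, RESNAME, column, score) for motif scores >= threshold.

    The default threshold is an arbitrary illustration value.
    """
    hits = []
    for row in rows:
        for key, value in row.items():
            if key.startswith("Q_") and key not in SECONDARY and value >= threshold:
                hits.append((row["RESID"], row["RESNAME"], key, value))
    return hits

rows = parse_mics_table(SAMPLE)
for row in rows:
    best = max(SECONDARY, key=lambda k: row[k])
    print(row["RESID"], row["RESNAME"], best, row[best])
```

The parser takes the column names from the VARS line rather than from the FORMAT line, so it does not depend on the exact format specifiers.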


Chemical shift data pre-check

Similar to TALOS+, MICS includes a feature that pre-checks chemical shift referencing and possible chemical shift errors:

   mics -in myshifts.tab -check

It checks the referencing of the 13CA, 13CB, 1HA and 13C' chemical shifts, using the empirical correlation between certain sets of chemical shift data (Wang et al., 2005, J. Biomol. NMR 32: 13-22). The estimated chemical shift referencing offsets, as well as any chemical shifts that deviate substantially from their expected ranges, are printed in the following format:


   Chemical shift outlier checking...
     ...
     64 E CB Secondary Shift: -3.800 Limit: -3.765
     76 G  C Secondary Shift:  4.250 Limit:  1.925 !

   Chemical shift referencing checking...
      Estimated Referencing Offset for CA/CB: 0.795 +/- 0.104 ppm (Size: 66)

Note that:

(1) A chemical shift referencing correction is likely required whenever the estimated referencing error approaches the average uncertainty in the database chemical shifts (~1.0 ppm for 13CA/CB and 13C' shifts; ~0.3 ppm for 1HA shifts), and/or the estimated referencing error is larger than five times the reported referencing offset uncertainty.

(2) Chemical shift outliers that fall far outside (>2-3 times) the expected range of secondary chemical shifts (marked by "!") are unlikely to be correct (or, as in the example above, correspond to a C-terminal carboxylate rather than a backbone carbonyl) and need to be checked carefully.
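The referencing rule of thumb can be applied mechanically to the reported offset line. The following Python sketch is an illustration only: the regular expression is modeled on the single example line shown above, the function names are hypothetical, and the near_fraction parameter (how close to the database uncertainty counts as "approaching" it) is an assumption, since the text gives no numerical cutoff.

```python
import re

# Pattern modeled on the example line in the pre-check output above.
OFFSET_RE = re.compile(
    r"Estimated Referencing Offset for (\S+):\s*"
    r"([-+]?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)\s*ppm"
)

def parse_offsets(report):
    """Return (label, offset_ppm, uncertainty_ppm) for each offset line."""
    return [
        (m.group(1), float(m.group(2)), float(m.group(3)))
        for m in OFFSET_RE.finditer(report)
    ]

def needs_correction(offset, uncertainty, db_uncertainty, near_fraction=0.8):
    """Apply the two criteria: offset close to the database uncertainty,
    and/or offset larger than five times its reported uncertainty.

    near_fraction is an assumed cutoff, not a value defined by MICS.
    """
    approaches_db = abs(offset) >= near_fraction * db_uncertainty
    exceeds_5x = abs(offset) > 5.0 * uncertainty
    return approaches_db or exceeds_5x

report = "Estimated Referencing Offset for CA/CB: 0.795 +/- 0.104 ppm (Size: 66)"
for label, offset, sigma in parse_offsets(report):
    # ~1.0 ppm is the database uncertainty quoted above for 13CA/CB shifts.
    print(label, offset, sigma, needs_correction(offset, sigma, db_uncertainty=1.0))
```

For the example above, the offset of 0.795 ppm exceeds five times its uncertainty (0.52 ppm), so a referencing correction would be flagged.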

MICS also provides an option "-offset" to automatically apply a chemical shift offset correction if needed:

   mics -in myshifts.tab -offset


and an option "-iso" to apply a 2H isotope correction to CA/CB chemical shifts collected from a perdeuterated protein sample:

   mics -in myshifts.tab -iso


Chemical Shift Input Format Used by MICS

MICS uses the same chemical shift input format as TALOS+. An example portion of the required shift table is shown below (full example: ubiq.tab). Other examples can be found in the mics/demo directory of the MICS installation or at the MICS Server site. More details regarding the required format can be found on the TALOS+ webpage.

Example shift table (excerpt):

   REMARK Ubiquitin input for TALOS, HA2/HA3 assignments arbitrary.

   DATA FIRST_RESID 1

   DATA SEQUENCE MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ QRLIFAGKQL
   DATA SEQUENCE EDGRTLSDYN IQKESTLHLV LRLRGG

   VARS   RESID RESNAME ATOMNAME SHIFT
   FORMAT %4d   %1s     %4s      %8.3f

     1 M           HA                  4.23
     1 M           C                 170.54
     1 M           CA                 54.45
     1 M           CB                 33.27
     2 Q           HN                  8.90
     2 Q           N                 123.22
     2 Q           HA                  5.25
     2 Q           C                 175.92
     2 Q           CA                 55.08
     2 Q           CB                 30.76
   ...
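Before running a prediction, it can be useful to sanity-check a shift table. The following Python sketch reads the layout shown in the excerpt above (DATA FIRST_RESID, DATA SEQUENCE, then four-column shift rows), counts the shifts per residue, and reports residues whose name disagrees with the sequence. All function names are hypothetical, and the sketch handles only the fields that appear in the excerpt.

```python
from collections import Counter

SHIFT_TABLE = """\
REMARK Ubiquitin input for TALOS, HA2/HA3 assignments arbitrary.

DATA FIRST_RESID 1

DATA SEQUENCE MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ QRLIFAGKQL
DATA SEQUENCE EDGRTLSDYN IQKESTLHLV LRLRGG

VARS   RESID RESNAME ATOMNAME SHIFT
FORMAT %4d   %1s     %4s      %8.3f

  1 M           HA                  4.23
  1 M           C                 170.54
  1 M           CA                 54.45
  1 M           CB                 33.27
  2 Q           HN                  8.90
  2 Q           N                 123.22
  2 Q           HA                  5.25
  2 Q           C                 175.92
  2 Q           CA                 55.08
  2 Q           CB                 30.76
...
"""

def parse_shift_table(text):
    """Collect the sequence, first residue number, residue names and shifts."""
    table = {"sequence": "", "first_resid": 1, "residues": {}, "shifts": {}}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "DATA" and len(fields) >= 3:
            if fields[1] == "SEQUENCE":
                table["sequence"] += "".join(fields[2:])
            elif fields[1] == "FIRST_RESID":
                table["first_resid"] = int(fields[2])
        elif len(fields) == 4:
            try:
                resid, shift = int(fields[0]), float(fields[3])
            except ValueError:
                continue  # not a data row
            table["residues"][resid] = fields[1]
            table["shifts"][(resid, fields[2])] = shift
    return table

def shifts_per_residue(table):
    """Number of assigned shifts for each residue number."""
    return dict(Counter(resid for resid, _atom in table["shifts"]))

def sequence_mismatches(table):
    """Residue numbers whose RESNAME disagrees with DATA SEQUENCE."""
    seq, first = table["sequence"], table["first_resid"]
    bad = []
    for resid, name in sorted(table["residues"].items()):
        index = resid - first
        if not 0 <= index < len(seq) or seq[index] != name:
            bad.append(resid)
    return bad

table = parse_shift_table(SHIFT_TABLE)
print(len(table["sequence"]), shifts_per_residue(table), sequence_mismatches(table))
```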
 


last update: May 23 2012 / sy