TALOS-N: Prediction of Protein Backbone and Sidechain Torsion Angles from NMR Chemical Shifts

As described in the paper:
   Protein backbone and sidechain torsion angles predicted from NMR chemical shifts using artificial neural networks.
   Yang Shen and Ad Bax, J. Biomol. NMR, 56, 227-241 (2013).

Contact:    ShenYang@niddk.nih.gov    Bax@nih.gov
Web:    http://spin.niddk.nih.gov/bax/software/TALOS-N
Server:    http://spin.niddk.nih.gov/bax/nmrserver/talosn
JRAMA Viewer:    http://spin.niddk.nih.gov/bax/software/TALOS-N/JRAMA/

Download and Installation

A stable version of the TALOS-N software package can be downloaded below. When downloading software from this website, you are agreeing to our Terms of Use, including the terms that there is no right to privacy on this system, and that the software from this website is not to be redistributed without permission from the authors. The TALOS-N package is provided for the linux, linux9, winxp and mac hardware and OS versions (as defined by the NMRPipe system), and requires at least ~0.5 GB of memory (to load the required library).

The most common TALOS-N installation procedure on a Unix system (linux, linux9, and mac) involves:
  1. Create a directory for the TALOS-N installation [for example, type mkdir /disk1/talosn in an "xterm" terminal window].
  2. Go to the selected install directory [cd /disk1/talosn].
  3. Download and put the TALOS-N installation files (talosn.tZ and install.com) into the selected install directory
    • Via a web browser: Right-click the download links below and select "Save Target As", "Save Link As" or "Download Linked File (As)" (depending on the browser type), and save the files into the selected install directory (Be sure to retain the exact file name shown below).
    • Or via the unix command "wget":
      wget http://spin.niddk.nih.gov/bax/software/TALOS-N/talosn.tZ
      wget http://spin.niddk.nih.gov/bax/software/TALOS-N/install.com
  4. Execute the install.com script; in most cases, no arguments will be required; it will be sufficient to make the install scripts executable, then run the install.com script. Note: Use the command ./install.com +help to generate a list of install command-line options.
  5. The installation will automatically search for an installed NMRPipe system, and patch the NMRPipe system by adding the TALOS-N installation and initialization information.
  6. If no installed NMRPipe system is detected, the installation will generate a "stand-alone" initialization script talosn_init.com, and recommend a common way to apply the initialization, i.e., adding the following lines to the ~/.cshrc file:
    if (-e /disk1/talosn/talosn_init.com) then
       source /disk1/talosn/talosn_init.com
    endif
  7. Note: The install.com script does not work for a Windows installation. However, Windows users can still use the TALOS-N program after a simple manual installation. For details, please check the help message in the install.com script, or the output of the command ./install.com +help.

    Note: NMRPipe users do not need a separate installation of TALOS-N as described above; TALOS-N is already installed together with the NMRPipe package.

There is also a Web-Based version of TALOS-N which can be used directly without installing TALOS-N. A Java version viewer (JRAMA) is also available to display TALOS-N results without installing TALOS-N. You can access this Web-based system, along with other facilities for manipulating chemical shifts, dipolar couplings, and molecular structures at the Bax Group NMR Server site:

TALOS-N Web Server:    http://spin.niddk.nih.gov/bax/nmrserver/talosn

TALOS-N Installation Files (Version 4.12 Rev 2015.147.15.40):
    talosn.tZ [size: 131MB]
    install.com [size: 13KB]


What is TALOS-N?
Reliability of TALOS-N
Components of the TALOS-N System
How to Use TALOS-N
Chemical Shift Input Format Used by TALOS-N
Inspecting and Refining the Prediction Results
How to Select Consistent Predictions


What is TALOS-N?

TALOS-N is an artificial neural network (ANN) based hybrid system for empirical prediction of protein backbone φ/ψ torsion angles, sidechain χ1 torsion angles and secondary structure using a combination of six kinds (HN, Hα, Cα, Cβ, CO, N) of chemical shift assignments for a given residue sequence.

The original TALOS approach and its successor TALOS+ are extensions of the well-known observation that many kinds of secondary chemical shifts (i.e. differences between chemical shifts and their corresponding random coil values) are highly correlated with aspects of protein secondary structure. The goal of TALOS-N is again to use secondary chemical shift and sequence information to make quantitative predictions for the protein backbone angles φ/ψ, and to provide a measure of the uncertainties in these predictions. In the original TALOS approach, we search a high-resolution structural database (for which experimental chemical shifts are available) for the 10 best matches to the secondary chemical shifts of a given residue in a target protein along with its two flanking neighbors (a residue triplet). If there is a consensus of φ and ψ angles among the 10 best database matches, then we use these database triplet structures to form a prediction for the backbone angles of the target residue. The later TALOS+ approach added an ANN classification scheme to this database mining approach. This ANN analyzed the chemical shifts and sequence to estimate the likelihood of a given residue being in an α, β, or positive-φ conformation. This ANN classification information was then combined with the database mining results, thereby increasing the number of residues for which useful backbone angle predictions can be made.

TALOS-N relies far more extensively on trained ANNs than TALOS+. In the TALOS-N method, the ANN that correlates chemical shifts with backbone conformation describes the Ramachandran map in terms of 324 voxels, rather than the three groupings used by TALOS+. TALOS-N also improves upon the original TALOS and TALOS+ database mining approaches by relying on (1) a large database of over 9500 high-quality X-ray structures to which chemical shift assignments were added by SPARTA+, and (2) an optimized database search procedure for the 25 best-matched database heptapeptides (rather than the 10 best-matched database tripeptides). The far greater reliance on ANN algorithms, together with the optimized database mining approach, allows TALOS-N to predict backbone torsion angles for a larger fraction (~90%) of residues in a given protein at improved precision.
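
The count of 324 voxels corresponds to dividing each 360-degree axis of the Ramachandran map into 20-degree bins (18 x 18 = 324). The following Python sketch maps a (φ, ψ) pair to a voxel. The -180 degree origin and the row-major index order are assumptions made for illustration, not the layout used inside TALOS-N:

```python
import math

VOXEL_SIZE = 20.0                      # degrees per voxel edge
BINS_PER_AXIS = int(360 / VOXEL_SIZE)  # 18 bins along phi and along psi


def voxel_index(phi, psi):
    """Return (row, col, flat) voxel indices for phi/psi in degrees.

    Assumption: bins start at -180 degrees and the flat index is
    row-major (psi selects the row, phi selects the column).
    """
    # Wrap the angles into [-180, 180) so that +180 and -180 coincide.
    phi = (phi + 180.0) % 360.0 - 180.0
    psi = (psi + 180.0) % 360.0 - 180.0
    col = int(math.floor((phi + 180.0) / VOXEL_SIZE))
    row = int(math.floor((psi + 180.0) / VOXEL_SIZE))
    return row, col, row * BINS_PER_AXIS + col


print(BINS_PER_AXIS ** 2)         # 324
print(voxel_index(-60.0, -45.0))  # (6, 6, 114)
```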

TALOS-N also includes an ANN component to derive sidechain χ1 angle information, as the χ1 value is known to impact the backbone chemical shifts.

In addition, TALOS-N offers several important features:

  1. TALOS-N can make predictions for the frequently encountered cases where residue assignments are lacking. Although the fraction of such residues for which unambiguous predictions can be made tends to be significantly lower, the reliability of such predictions remains relatively high.

  2. For convenience, and in order to prevent assignment of backbone torsion angles to regions that are dynamically disordered, TALOS-N also reports an estimated backbone order parameter S2 (RCI-S2), derived from the chemical shifts by the Random Coil Index (RCI) method of Berjanskii and Wishart (J. Am. Chem. Soc. 127: 14970-14971, 2005).

  3. For those residues whose backbone torsion angles cannot be predicted uniquely by TALOS-N, but whose backbone is not dynamically disordered as judged by RCI-S2, the ANN predicted 324-state (φ,ψ) distribution frequently strongly limits the chemical shift compatible φ/ψ values to two small, discrete regions of the Ramachandran map, which may prove useful in structure determination efforts.

  4. TALOS-N provides ANN-predicted secondary structure information from the chemical shifts (and/or protein sequence), with high prediction accuracy.


[Figure: TALOS-N flowchart]

[Figure: A flowchart for the TALOS-N database search procedure]

Reliability of TALOS-N
Reliability of backbone φ/ψ predictions

As with TALOS/TALOS+, the reliability of the TALOS-N approach was tested by a cross-validation "leave-one-out" procedure where each protein was removed from the database, and its φ/ψ angles were predicted using the remaining protein data. For the purposes of testing, a prediction is considered as consistent ("Strong" or "Generous") if it falls in a single cluster region of the Ramachandran map. A prediction is considered "Bad" or incorrect if it significantly deviates from the observed φ/ψ angles from the crystal structure (see the definition in "How to Select Consistent Predictions in TALOS-N" below). According to the tests:

  1. TALOS-N makes consistent predictions for, on average, about 90% of the residues. Importantly, the majority of those consistent predictions (~87% of the residues) are shown as "Strong", when all 25 best database matches are well-clustered. Predictions are classified as "Generous" when only the top 10 best database matches cluster in a narrow region.

  2. (IMPORTANT!) Over all 580 database proteins, about 3.5% of the unambiguous predictions made by TALOS-N were "Bad" relative to the corresponding crystal structure, using an acceptance criterion that is nearly two-fold tighter than that used previously. However, a substantial fraction of this 3.5% appears to reflect genuine differences relative to the crystalline state, and the true error rate therefore is believed to be considerably lower.

  3. On average, the RMSD as reported by TALOS-N for the consensus predictions was 8.7 degrees for φ, and 8.5 degrees for ψ.

  4. The actual RMSD of the "correct" predictions relative to the crystal structures was 12.3 degrees for φ, and 12.1 degrees for ψ (which includes the uncertainty in the X-ray derived angles).

As noted in (2) above, it must be remembered that TALOS-N will produce a small number of predictions which seem to be valid (because the best matches from the database are consistent) but which are nevertheless in error.

Reliability of sidechain χ1 predictions

Analysis of X-ray derived protein structures in terms of χ1 angles is complicated by the fact that many residues are subject to rotameric averaging, with commonly only a single conformer represented in the X-ray structures. TALOS-N can identify the chemical shift signature of a given χ1 rotamer for about 50% of the residues, all corresponding to cases where no extensive rotamer averaging is taking place. When just considering β-branched residues (Ile, Val, Thr) predictability increases to over 80% but, conversely, predictability of hydrophilic residues such as Lys, Arg, Glu, Asn, His and Ser as well as the highly flexible Met sidechain falls below 25%, on average.

Reliability of secondary structure predictions

The secondary structure performance of TALOS-N is virtually identical to that of TALOS+, with a Q3 score of ~88% when evaluated over the validation set, which likely approaches the limit of what is achievable when considering that even for proteins of known structure different programs typically show agreement no better than 90%.

TALOS-N also includes a network that is trained to predict secondary structure from the amino acid sequence alone. It yields a Q3 score of ~81% for the CASP9 target proteins, making it comparable in performance with the upper limit of 80-82% Q3 scores reported by other popular bioinformatics programs. Importantly, this amino acid sequence based module is seamlessly implemented in TALOS-N as a complement to the chemical shift based module and can bridge stretches in proteins that lack chemical shifts.
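
The Q3 scores quoted in this section are the percentage of residues whose three-state class (helix, sheet, loop) is predicted correctly. A minimal Python sketch of that calculation follows. The one-letter codes H/E/L and the two example strings are invented for illustration and do not come from TALOS-N output:

```python
def q3_score(predicted, observed):
    """Percentage of residues whose three-state labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(predicted)


# Hypothetical 13-residue example with one disagreement (position 6).
pred = "LHHHHHLLEEEEL"
obs = "LHHHHLLLEEEEL"
print(round(q3_score(pred, obs), 1))  # 92.3
```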

It should also be noted that the tests above included only the most well-defined parts of each protein; roughly 6% of the residues had first been removed because they had high B factors (exceeding 1.5 times the average B-factor for that protein) in the crystal structure or because they were known to be highly mobile in solution. Evaluation of the results indicates that many of the "Bad" predictions occur outside of regions of secondary structure, where the X-ray and solution structures may actually differ from one another, as evidenced by large differences between X-ray structures when multiple such structures are available for the same protein. Therefore, the accuracy of TALOS-N will vary from protein to protein, and tends to be lower for proteins with large flexible regions. A partial remedy is to increase the S2 threshold for "dynamic" residues to 0.65, but this will decrease the number of consensus predictions made.
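
The B-factor filter described above can be sketched in Python as follows. The residue numbers and B-factors are invented for illustration; only the cutoff of 1.5 times the protein's average B-factor comes from the text:

```python
def filter_high_b_factor(b_factors, ratio=1.5):
    """Split residues by comparing each B-factor to ratio * mean B.

    b_factors maps residue number -> B-factor for one protein.
    Returns (kept, removed, cutoff).
    """
    mean_b = sum(b_factors.values()) / len(b_factors)
    cutoff = ratio * mean_b
    kept = {r: b for r, b in b_factors.items() if b <= cutoff}
    removed = {r: b for r, b in b_factors.items() if b > cutoff}
    return kept, removed, cutoff


# Hypothetical values: the mean is 20.0, so the cutoff is 30.0.
example = {1: 40.0, 2: 18.0, 3: 15.0, 4: 16.0, 5: 11.0}
kept, removed, cutoff = filter_high_b_factor(example)
print(cutoff, sorted(removed))  # 30.0 [1]
```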


Components of the TALOS-N System

The TALOS-N core system is implemented in the C++ language, and includes a graphical interface to inspect the prediction results. The graphical interface, called jRAMA, is implemented in Java.

There are two major scripts comprising the TALOS-N system:

  1. TALOS-N (script name: talosn)
    Performs the φ/ψ/χ1 torsion angles predictions and secondary structure classifications, then summarizes the predictions.

  2. RAMA (script name: jrama), or the alternative web-based jRAMA applet
    Interactive display and refinement of the predictions (Java platform required).

Both of these scripts can be invoked with the -help command-line argument to generate a complete list of options.

Other files of the TALOS-N system include:

talosn/demo    A directory with example chemical shift input data and scripts for a demo of TALOS-N.
talosn/tab     A directory with all required parameter sets of TALOS-N:
    talos.tab          The compiled 9523-protein database of residues with their corresponding SPARTA+ calculated secondary shifts and observed φ/ψ/χ1 values.
    talos.obsCS.tab    The compiled 580-protein database of residues with their corresponding experimental secondary shifts and observed φ/ψ/χ1 values.
    randcoil.tab       The table of random coil shifts used in the prediction process.
    rc*.tab            The neighboring residue correction tables of random coil shifts.
    *level*.tab        The weighting factors and biases of the neural network used in the prediction process.
talosn/bin     The compiled TALOS-N binary files for multiple platforms, such as Linux (TALOSN.linux, TALOSN.linux9, TALOSN.linux9_x64), MacOS (TALOSN.mac), and Windows (TALOSN.winxp).
talosn/com     Contains some example utility scripts for file format conversion, which can be copied to a working directory and used as needed:
    talos2dyana.com    This Unix shell script generates a Dyana/Cyana format torsion angle restraint file from a standard TALOS+/TALOS-N output file.
    talos2xplor.com    This Unix shell script generates an XPLOR format torsion angle restraint file from a standard TALOS+/TALOS-N output file.
    talos2xplor.tcl    This tcl script is an alternative to talos2xplor.com, and requires an NMRPipe system.
    bmrb2talos.com     This Unix shell script converts NMR-Star format shifts to TALOS input format.
talosn_ss      Master script for a (fasta) sequence based protein secondary structure prediction (a Psi-Blast installation is required).

The standard NMRPipe installation also includes scripts star2cs.tcl and shift2tab.tcl for converting NMR-Star and PIPP format shifts to TALOS input format.


How to Use TALOS-N

Use of TALOS-N is much the same as for TALOS and TALOS+:

  1. Create a directory for the prediction session; all subsequent commands will be executed from this directory.

  2. Prepare the input table of chemical shift assignments (for example "myshifts.tab"), according to the format given below.

  3. Run TALOS-N (talosn) to perform the database searches. Most commonly, this will simply require a command such as:

    talosn -in myshifts.tab

    During the database search, a summary file "predAll.tab" will be created to store the 25 best database matches for all residues in the target protein. Before exiting, a file "pred.tab" will also be created, which includes an initial summary of the prediction results. Additionally, three files "predAdjCS.tab", "predABP.tab" and "predSS.tab" will be created to store the calculated secondary chemical shifts used for prediction, the ANN-predicted 324-state φ/ψ distribution information and the predicted secondary structure, respectively. The database search will typically take about 100 sec per 100 residues.

  4. Run RAMA (jrama) or web based jRAMA to inspect and adjust the predictions. The simplest RAMA invocations are:

    jrama -in myshifts.tab
    jrama -in myshifts.tab -ref mystruct.pdb

    During this inspection, you will:

    • Examine the φ/ψ distributions of the center residues of the best 25 database matches for a given query residue, and decide which ones should be included in the prediction, and which are "outliers". (NOTE: in the vast majority of cases, the initial automated classifications performed by the current version of the TALOS-N program should be acceptable with no manual adjustment needed).
    • Classify the results for a given residue as "Strong", "Generous", "Ambiguous", or (if a reference structure is known) "Bad".

    The file "predAll.tab" will be adjusted along the way to reflect any changes made interactively, and a new "pred.tab" summary file will be created on exiting. When the above steps are completed, the final "pred.tab" file will include the classification ("Strong", "Generous", etc) and predictions (averages and standard deviations) for φ and ψ at each residue.

  5. Convert TALOS-N results to other formats, for use as structural restraints, etc. The TALOS-N package includes shell scripts such as "talosn2dyana.com" and "talosn2xplor.com" for this purpose. Examples of their use:

    $TALOSN_DIR/com/talosn2dyana.com pred.tab > talos.aco
    $TALOSN_DIR/com/talosn2xplor.com pred.tab > talos.tbl

    jRAMA offers similar features in its menu bar ("Tools").

    Default criteria for angle constraint conversion: For "Strong" predictions, the φ and ψ angles are set to <φ> +/- 2sd and <ψ> +/- 2sd, where <φ> and <ψ> are the averaged TALOS-N predictions, and 2sd is the larger of 20 deg or two standard deviations of the TALOS-N prediction. For "Generous" predictions, the φ and ψ angles are set to <φ> +/- 3sd and <ψ> +/- 3sd, where <φ> and <ψ> are the averaged TALOS-N predictions, and 3sd is the larger of 30 deg or three standard deviations of the TALOS-N prediction.
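
The default conversion criteria above can be written out as a short Python sketch. The function names and the example mean/sd values are hypothetical; the sketch does not read pred.tab and leaves the bounds unwrapped (they may fall outside the -180 to 180 degree range):

```python
def restraint_half_width(sd, classification):
    """Half-width in degrees of a phi or psi restraint."""
    if classification == "Strong":
        return max(20.0, 2.0 * sd)   # larger of 20 deg or 2 sd
    if classification == "Generous":
        return max(30.0, 3.0 * sd)   # larger of 30 deg or 3 sd
    raise ValueError("no restraint is generated for class " + classification)


def restraint_bounds(mean, sd, classification):
    """Return (lower, upper) bounds around the averaged prediction."""
    half = restraint_half_width(sd, classification)
    return mean - half, mean + half


# Hypothetical predictions: (mean, sd, class).
print(restraint_bounds(-62.0, 6.0, "Strong"))      # (-82.0, -42.0)
print(restraint_bounds(-120.0, 14.0, "Generous"))  # (-162.0, -78.0)
```

In the first case two standard deviations (12 degrees) are below the 20-degree floor, so the floor applies; in the second, three standard deviations (42 degrees) exceed the 30-degree floor.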

Chemical shift data pre-check

Similar to TALOS+, TALOS-N includes a feature that pre-checks chemical shift referencing and possible chemical shift errors:

talosn -in myshifts.tab -check

It checks the referencing for 13Cα, 13Cβ, 1Hα and 13C' chemical shifts, using the empirical correlation between certain sets of chemical shift data (Wang et al., 2005, J. Biomol. NMR, 32:13-22). The estimated chemical shift referencing offsets, as well as any chemical shifts that deviate strongly from their expected ranges, are printed in the following format:

   Chemical shift outlier checking...
     64 E CB Secondary Shift: -3.800 Limit: -3.765
     76 G  C Secondary Shift:  4.250 Limit:  1.925 !

   Chemical shift referencing checking...
      Estimated Referencing Offset for CA/CB: 0.795 +/- 0.104 ppm (Size: 66)

Note that (1) a chemical shift referencing correction is likely required whenever the estimated referencing error approaches the average uncertainty in the database chemical shifts (~1.0 ppm for 13Cα/Cβ and 13C' shifts; ~0.3 ppm for 1Hα shifts), and/or the estimated referencing error is larger than five times the average fitting error; (2) chemical shift outliers, which fall far outside (more than 2-3 times) the expected range of secondary chemical shifts and are marked by "!", are unlikely to be correct (or, as in the example above, correspond to a C-terminal carboxylate instead of a backbone carbonyl) and need to be checked carefully.
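
As a rough illustration, the pre-check output shown above can be screened with a few lines of Python. This sketch assumes the output lines look exactly like the example, and it treats the "+/-" value as the fitting error used in the five-times rule; both are assumptions, and the function names are hypothetical:

```python
import re

OFFSET_LINE = re.compile(
    r"Estimated Referencing Offset for (?P<atoms>\S+):\s*"
    r"(?P<offset>-?\d+(?:\.\d+)?)\s*\+/-\s*(?P<error>\d+(?:\.\d+)?)\s*ppm"
)


def parse_offset(line):
    """Return (atoms, offset, error) from an offset line, or None."""
    m = OFFSET_LINE.search(line)
    if m is None:
        return None
    return m.group("atoms"), float(m.group("offset")), float(m.group("error"))


def exceeds_fit_error(offset, error, factor=5.0):
    """True if |offset| is larger than factor times the fitting error."""
    return abs(offset) > factor * error


def flagged_outliers(lines):
    """Keep only the outlier lines that end with the '!' marker."""
    return [ln.strip() for ln in lines if ln.rstrip().endswith("!")]


report = [
    "     64 E CB Secondary Shift: -3.800 Limit: -3.765",
    "     76 G  C Secondary Shift:  4.250 Limit:  1.925 !",
    "      Estimated Referencing Offset for CA/CB: 0.795 +/- 0.104 ppm (Size: 66)",
]
atoms, offset, error = parse_offset(report[2])
print(atoms, offset, error, exceeds_fit_error(offset, error))  # CA/CB 0.795 0.104 True
print(len(flagged_outliers(report)))                           # 1
```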

TALOS-N uses an option "-offset" to automatically apply chemical shift offset correction if needed:

talosn -in myshifts.tab -offset

and an option "-iso" to apply a 2H isotope correction to 13Cα/13Cβ chemical shifts collected from a perdeuterated protein sample:

talosn -in myshifts.tab -iso

Exclusion of proteins from the database

Excluding one or more proteins from the database during the TALOS-N database search can be performed by a command line such as:

talosn -in myshifts.tab -excl name1 name2 ...

where "name1" and "name2" etc. are the names of the proteins to be excluded (see the valid protein names in the database "talos.tab").
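
When talosn is driven from a script, the options described above can be assembled programmatically. The Python sketch below only builds the argument list; the helper name is hypothetical, and combining several options in a single invocation is an assumption, since the examples above show each option on its own:

```python
import shlex


def build_talosn_command(shift_table, check=False, offset=False,
                         iso=False, exclude=()):
    """Build a talosn argument list from the options documented above."""
    cmd = ["talosn", "-in", shift_table]
    if check:
        cmd.append("-check")
    if offset:
        cmd.append("-offset")
    if iso:
        cmd.append("-iso")
    if exclude:
        # -excl takes a variable number of names, so it is placed last.
        cmd += ["-excl", *exclude]
    return cmd


cmd = build_talosn_command("myshifts.tab", offset=True,
                           exclude=["name1", "name2"])
print(" ".join(shlex.quote(arg) for arg in cmd))
# talosn -in myshifts.tab -offset -excl name1 name2

# To actually run it (requires a TALOS-N installation on the PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```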

Amino acid sequence based protein secondary structure prediction

By default, the amino acid sequence based protein secondary structure prediction module is seamlessly implemented in TALOS-N as a complement to the chemical shift based module and can bridge stretches in proteins that lack chemical shifts. This amino acid sequence based module can also be run separately by a command line such as:

talosn_ss my_sequence.fasta

where "my_sequence.fasta" is the sequence input file with a standard FASTA format. Please check the "talosn_ss" script for all requirements in order to run this module.


Chemical Shift Input Format Used by TALOS-N

TALOS-N requires an input chemical shift table of standard nmrPipe/TALOS format. An example portion of the required chemical shift table format is shown below (full example: ubiq.tab). Other examples can be found in the talosn/demo directories, or at the TALOS-N Server site. Specifically:

  • The TALOS chemical shift table uses the general-purpose NMRPipe table format.

  • 13C chemical shifts for Cα, Cβ, and CO used as input for TALOS/TALOS+/TALOS-N should be referenced relative to TSP. The 15N chemical shifts used as input for TALOS/TALOS+/TALOS-N should be referenced relative to liquid ammonia at 25 degrees C.

  • Use the optional DATA FIRST_RESID line to specify the first residue ID number of the sequence. If it is not specified, residue numbering is assumed to begin at 1.

  • The protein sequence should be given as shown, using one or more DATA SEQUENCE lines. Space characters in the sequence will be ignored. Use "c" for oxidized CYS (Cβ ~ 42.5 ppm) and "C" for reduced CYS (Cβ ~ 28 ppm), "h" for protonated HIS and "H" for deprotonated HIS, in both the sequence header and the shift table. Use X for residues other than the usual 20 amino acids.

  • The table must include columns for residue ID, one-character residue name, atom name, and chemical shift.

  • The table must include a "VARS" line which labels the corresponding columns of the table.

  • The table must include a "FORMAT" line which defines the data type of the corresponding columns of the table.

  • Atom names are always given exactly as:
    HA for Hα of all residues except glycine
    HA2 for the first Hα of glycine residues
    HA3 for the second Hα
    C for C' (CO)
    CA for Cα
    CB for Cβ
    N for N-amide
    HN for H-amide
  • As noted, there is an exception for naming Gly assignments, which should use HA2 and HA3 instead of HA. In the case of Gly HA2/HA3 assignments, TALOS/TALOS+/TALOS-N will use the average value of the two, so it is not necessary to have these assigned stereospecifically; for use with TALOS/TALOS+/TALOS-N, the assignment can be arbitrary. Note, however, that each assignment must be given exactly as either "HA2" or "HA3", not as "HA2|HA3" etc.

  • Other types of assignments may be present in the chemical shift table; they will be ignored.

  • TALOS-N now also has the option to use chemical shift input in the BMRB NMR-Star format. If NMR-Star format input is used, the input must contain shifts for a single protein chain only. It must also contain complete sequence information for the protein. Specifically, the NMR-Star format table must contain a sequence section with _Residue_seq_code and _Residue_label values, and a chemical shift section with values for _Residue_seq_code _Residue_label _Atom_name _Atom_type and _Chem_shift_value.  Example: ubiq_bmr6457_1D3Z.str.

Example shift table (excerpt):

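REMARK Ubiquitin input for TALOS, HA2/HA3 assignments arbitrary.

DATA SEQUENCE MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ QRLIFAGKQL
DATA SEQUENCE EDGRTLSDYN IQKESTLHLV LRLRGG

   VARS   RESID RESNAME ATOMNAME SHIFT
   FORMAT %4d   %1s     %4s      %8.3f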

     1 M           HA                  4.23
     1 M           C                 170.54
     1 M           CA                 54.45
     1 M           CB                 33.27
     2 Q           HN                  8.90
     2 Q           N                 123.22
     2 Q           HA                  5.25
     2 Q           C                 175.92
     2 Q           CA                 55.08
     2 Q           CB                 30.76
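
For scripted checks of an input table before running TALOS-N, a table in this format can be read with a few lines of Python. This is a minimal sketch, not the NMRPipe parser: it handles only the REMARK, DATA, VARS and FORMAT lines described above, and the sample text (including its VARS labels and shortened sequence) is illustrative:

```python
CASTS = {"d": int, "f": float, "s": str}  # last letter of each FORMAT code


def parse_shift_table(text):
    """Return (first_resid, sequence, rows) from a TALOS-style table."""
    first_resid, sequence = 1, ""
    columns, casts, rows = [], [], []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0] == "REMARK":
            continue
        if fields[0] == "DATA":
            if len(fields) > 2 and fields[1] == "FIRST_RESID":
                first_resid = int(fields[2])
            elif len(fields) > 2 and fields[1] == "SEQUENCE":
                sequence += "".join(fields[2:])  # spaces are ignored
        elif fields[0] == "VARS":
            columns = fields[1:]
        elif fields[0] == "FORMAT":
            casts = [CASTS[code[-1]] for code in fields[1:]]
        else:
            use = casts or [str] * len(columns)
            rows.append({c: f(v) for c, f, v in zip(columns, use, fields)})
    return first_resid, sequence, rows


SAMPLE = """\
REMARK Illustrative excerpt only.
DATA FIRST_RESID 1
DATA SEQUENCE MQIFVKTLTG
VARS   RESID RESNAME ATOMNAME SHIFT
FORMAT %4d   %1s     %4s      %8.3f
   1 M           HA                  4.23
   1 M           CA                 54.45
   2 Q           HN                  8.90
"""

first_resid, sequence, rows = parse_shift_table(SAMPLE)
shifts = {(r["RESID"], r["ATOMNAME"]): r["SHIFT"] for r in rows}
print(sequence, len(rows), shifts[(2, "HN")])  # MQIFVKTLTG 3 8.9
```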

Inspecting and Refining the Prediction Results

The final step in interpreting the results of the TALOS-N database search is to inspect and classify the matches so that useful predictions can be formed; however, in most cases, the initial automated classifications performed by the current version of the TALOS-N program should be acceptable with no manual adjustment needed.

Refinement of predictions can be made via the RAMA graphical interface jrama, which is included in the package, or a web-based Java version of the RAMA Viewer (JRAMA). The simplest invocation of jrama is:

jrama -in pred.tab

If a proposed structure is available, first run TALOS-N with it to generate a prediction summary:

talosn -in myshifts.tab -ref mystruct.pdb

Then, invoke RAMA so that the reference structure is included in the display of prediction data:

jrama -in pred.tab -ref mystruct.pdb

The various windows displayed by jrama are shown below.

TALOS Sequence Window

Sequence Window: displays the target protein sequence, with each residue colored according to its classification. Clicking on a residue with the mouse will select that residue for display and analysis in the other windows. The residues are colored according to this scheme:
    Light Green    Strong unambiguous prediction (no outlier among the best 25 database matches)
    Dark Green     Generous unambiguous prediction (no outlier among the best 10 database matches)
    Yellow         Ambiguous; no prediction
    Blue           Dynamic; no prediction
    Red            Bad prediction relative to a known structure
    Gray           No classification yet

TALOS Ramachandran window

Ramachandran window: graphs the φ/ψ distributions of the 25 best database matches for the currently selected residue. It also displays the average and standard deviation of φ and ψ for those matches which are selected (i.e. included in the prediction), as well as filled semi-transparent 20° x 20° voxels on the Ramachandran map, depicting the ANN-predicted probability to find any given residue in the φ/ψ regions defined by those voxels. (Note that only those voxels that are at least one standard deviation above the average predicted voxel density are displayed.) The shaded region of the map shows the most populated regions of the TALOS-N database for the residue type in question.

In the graph, each match from the database is drawn as a small square at a particular φ/ψ coordinate. The individual squares can be toggled by a mouse click, to include or remove the corresponding match from the prediction. The squares are colored according to this scheme:
    Light Green    This match is included in the "Strong" prediction.
    Dark Green     This match is included in the "Generous" prediction.
    Red            Outlier; not included in the prediction
    Blue           Reference (φ/ψ taken from the "-ref" structure)

The Ramachandran window also includes buttons to reclassify the overall prediction as "Good(Strong)", "Good(Generous)", "Ambiguous", etc., and to move to the next or previous residue in the sequence.

TALOS Secondary Structure Prediction Window

Secondary Structure and RCI-S2 Prediction Window: graphs the predicted order parameter S2 (upper panel) and ANN-predicted secondary structure (lower panel; aqua, beta-sheet; red, helix) for all residues. The height of the bars reflects the probability of the neural network secondary structure prediction. The RCI-S2 value and the probabilities of the 3-state [helix|sheet|loop] secondary structure prediction for the current residue (indicated by yellow vertical lines) are labeled above the corresponding panel, followed by the S2 and secondary structure probabilities for the "cursor-activated" residue (indicated by white vertical lines).

For more about the RCI method for predicting order parameter from chemical shifts, see:
Berjanskii MV and Wishart DS (2005) A simple method to predict protein flexibility using secondary chemical shifts. J. Am. Chem. Soc. 127: 14970-14971

TALOS Prediction Window

χ1 Rotameric State Prediction Window: displays the predicted sidechain χ1 rotameric state for the currently selected residue in the target protein. The probabilities of the 3-state [gauche-|gauche+|trans] χ1 conformation prediction for the current residue (indicated by yellow vertical line) are labeled above the panel, followed by the probabilities for the "cursor-activated" residue (indicated by white vertical lines). Red, green and yellow ovals correspond to χ1 predictions of g-, g+, and t, respectively, with the height of the ovals reflecting the probability of the prediction.

TALOS Secondary Shift Window

Secondary Shift Window: graphs the secondary chemical shift distributions of the 25 best database matches for the currently selected residue (green), and of the currently selected residue in the target protein (blue). This window is disabled by default and can only be activated from the "Display" menu of the Sequence Window.


How to Select Consistent Predictions in TALOS-N

The original TALOS rules for defining consistent ("Good") predictions were based on clustering of at least 9 of the 10 best database matches in the same region of the Ramachandran map, while the TALOS+ rules required all 10 of the 10 best database matches to cluster in the same region. TALOS-N now searches for the 25 best database matches and makes two different types of consistent predictions, called "Strong" and "Generous". The TALOS-N rules for defining consistent predictions are:

  1. for the "Strong" consistent predictions, all 25 best database matches fall in a "consistent" region of the Ramachandran map; otherwise,
  2. if only the top 10 best database matches cluster in the same region of the Ramachandran map, a "Generous" consistent prediction is made.

All other cases are considered "Ambiguous". Note that all the cases with predicted S2 value <0.6 are likely to be "Dynamic", and will not be considered as unambiguous predictions.
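
The rules above can be summarized in a short Python sketch. It assumes that each of the 25 best matches has already been assigned a Ramachandran region label (how TALOS-N decides that matches share a "consistent" region is not specified here), and that the S2 check takes precedence over the clustering rules; both are assumptions, and the labels "A" and "B" are placeholders:

```python
def classify_residue(region_labels, s2, s2_threshold=0.6):
    """Classify one residue from its ranked database matches.

    region_labels: one region label per match, best match first.
    s2: predicted order parameter for the residue.
    """
    if s2 < s2_threshold:
        return "Dynamic"                    # no prediction is made
    if len(set(region_labels)) == 1:
        return "Strong"                     # all matches agree
    if len(set(region_labels[:10])) == 1:
        return "Generous"                   # only the top 10 agree
    return "Ambiguous"


print(classify_residue(["A"] * 25, 0.85))               # Strong
print(classify_residue(["A"] * 10 + ["B"] * 15, 0.85))  # Generous
print(classify_residue(["A", "B"] + ["A"] * 23, 0.85))  # Ambiguous
print(classify_residue(["A"] * 25, 0.45))               # Dynamic
```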

When a reference structure is available, predictions will be flagged as "Bad" (automatically by TALOS-N) if the following condition applies (angles in degrees):

|φ(obs) - φ(pred)|^2 + |ψ(obs) - ψ(pred)|^2 > 60^2

In practice, for a consistent prediction the standard deviation of φ and ψ for the selected group of matches will usually be 35 degrees or less (12-13 degrees on average).

When inspecting the φ/ψ graphs to decide if matches are in a consistent region, keep in mind their "periodic" nature; i.e. angles at one edge of the graph are actually close to angles at the opposite edge.
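
The periodicity matters for any numerical comparison of angles, including the "Bad" criterion above. The Python sketch below wraps each difference into the -180 to 180 degree range before applying the 60-degree limit. Whether TALOS-N wraps differences in exactly this way is an assumption, and the example angles are invented:

```python
def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def is_bad(phi_obs, psi_obs, phi_pred, psi_pred, limit=60.0):
    """Apply the squared-distance criterion with wrapped differences."""
    d_phi = angle_diff(phi_obs, phi_pred)
    d_psi = angle_diff(psi_obs, psi_pred)
    return d_phi ** 2 + d_psi ** 2 > limit ** 2


print(angle_diff(175.0, -175.0))           # -10.0 (not 350)
print(is_bad(-60.0, -45.0, -65.0, -40.0))  # False
print(is_bad(-60.0, 140.0, -65.0, -40.0))  # True
```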



last update: May 27 2020 / sy