How to Use TALOS-N
Use of TALOS-N is much the same as for TALOS and TALOS+:
Create a directory for the prediction session; all
subsequent commands will be executed from this directory.
Prepare the input table of chemical shift assignments
(for example "myshifts.tab"), according to the format given below.
Run TALOS-N (talosn) to perform the database searches. Most commonly,
this will simply require a command such as:
talosn -in myshifts.tab
During the database search, a summary file "predAll.tab"
will be created to store the 25 best database matches for all residues
in the target protein. Before exiting, a file "pred.tab"
will also be created, which includes an initial summary of the prediction
results. Additionally, three files "predAdjCS.tab", "predABP.tab" and
"predSS.tab" will be created to store the calculated secondary chemical
shifts used for prediction, the ANN-predicted 324-state φ/ψ distribution
information and the predicted secondary structure, respectively.
The database search will typically take about 100 sec per 100 residues.
Run RAMA (jrama) or web-based jRAMA to inspect and adjust the
predictions. The simplest RAMA invocations are:
jrama -in myshifts.tab
jrama -in myshifts.tab -ref mystruct.pdb
During this inspection, you will:
Examine the φ/ψ distributions of the center residues
of the best 25 database matches for a given query residue, and
decide which ones should be included in the prediction, and which
are "outliers". (NOTE: in the vast majority of
cases, the initial automated classifications performed by the current
version of the TALOS-N program should be acceptable with no manual
adjustment needed).
Classify the results for a given residue as "Strong",
"Generous", "Ambiguous", or (if a reference
structure is known) "Bad".
The file "predAll.tab" will
be adjusted along the way to reflect any changes made interactively, and
a new "pred.tab" summary file will be created on exiting. When
the above steps are completed, the final "pred.tab"
file will include the classification ("Strong", "Generous",
etc.) and predictions (averages and standard deviations) for φ and ψ
at each residue.
Convert TALOS-N results to other formats, for use as
structural restraints, etc. The TALOS-N package includes shell scripts
such as "talosn2dyana.com" and "talosn2xplor.com" for this purpose;
examples of their use are:
$TALOSN_DIR/com/talosn2dyana.com pred.tab > talos.aco
$TALOSN_DIR/com/talosn2xplor.com pred.tab > talos.tbl
jRAMA offers similar features in its menu bar ("Tools").
Default criteria for conversion to angle constraints: For "Strong"
predictions, the φ and ψ angles are set to <φ> +/- 2sd and
<ψ> +/- 2sd, where <φ> and <ψ> are the averaged TALOS-N predictions,
and 2sd is the larger of 20 deg and two standard deviations of the
TALOS-N prediction.
For "Generous" predictions, the φ and ψ angles are set to <φ> +/- 3sd
and <ψ> +/- 3sd, where <φ> and <ψ> are the averaged TALOS-N
predictions, and 3sd is the larger of 30 deg and three standard
deviations of the TALOS-N prediction.
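The default criteria above amount to a small calculation, sketched here in
Python. This is only an illustration of the stated rule, not the code of
"talosn2dyana.com" or "talosn2xplor.com"; the function name
restraint_window is hypothetical, and the sketch does not wrap angles back
into the -180 to 180 deg range.

```python
def restraint_window(mean_deg, sd_deg, classification):
    """Return (lower, upper) bounds in degrees for one torsion angle.

    Hypothetical helper: applies only the default rule quoted in the text.
    """
    # (multiplier on the standard deviation, minimum half-width in degrees)
    rules = {"Strong": (2.0, 20.0), "Generous": (3.0, 30.0)}
    if classification not in rules:
        # The text gives no default window for other classifications.
        return None
    factor, floor_deg = rules[classification]
    half_width = max(floor_deg, factor * sd_deg)
    return (mean_deg - half_width, mean_deg + half_width)


# Strong, sd = 6 deg: two sd = 12 deg, so the 20 deg floor applies.
print(restraint_window(-63.0, 6.0, "Strong"))
# Generous, sd = 15 deg: three sd = 45 deg, which exceeds the 30 deg floor.
print(restraint_window(-120.0, 15.0, "Generous"))
```

The same function would be called once for φ and once for ψ of each
residue, using the averages and standard deviations listed in "pred.tab".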
Chemical shift data pre-check
Similar to TALOS+, TALOS-N includes a feature that pre-checks the
chemical shift referencing and flags possible chemical shift errors:
talosn -in myshifts.tab -check
It checks the referencing for ¹³Cα, ¹³Cβ, ¹Hα and ¹³C' chemical shifts,
using the empirical correlation between certain sets of chemical shift data
(Wang et al., 2005 J Biomol NMR, 32:13-22). The estimated chemical shift
referencing offsets, as well as the chemical shifts that deviate
substantially from their expected ranges, will be printed in the
following format:
Chemical shift outlier checking...
...
64 E CB Secondary Shift: -3.800 Limit: -3.765
76 G C Secondary Shift: 4.250 Limit: 1.925 !
Chemical shift referencing checking...
Estimated Referencing Offset for CA/CB: 0.795 +/- 0.104 ppm (Size: 66)
Note that (1) a chemical shift referencing correction
is likely required whenever the estimated referencing error approaches
the average uncertainty in the database chemical shifts (~1.0 ppm for
¹³Cα/Cβ and ¹³C' shifts; ~0.3 ppm for ¹Hα shifts), and/or the estimated
referencing error is larger than five times the average fitting error;
(2) chemical shift outliers that fall far outside the expected range
of secondary chemical shifts (by more than a factor of 2-3; marked
by "!") are unlikely to be correct (or, as in the example above,
correspond to a C-terminal carboxylate instead of a backbone carbonyl)
and need to be checked carefully.
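Rule (1) can be applied to the printed summary line with a few lines of
Python. This is a rough sketch: the regular expression is written against
the single sample line shown above, the labels "C" and "HA" for the other
nuclei are assumptions, and "approaches" is simplified to "reaches or
exceeds". The names parse_offset_line and correction_likely_needed are
hypothetical.

```python
import re

# Approximate database uncertainties quoted in the text, in ppm.
# Only the "CA/CB" label appears in the sample output; "C" and "HA"
# are assumed labels for the 13C' and 1H-alpha lines.
DATABASE_UNCERTAINTY_PPM = {"CA/CB": 1.0, "C": 1.0, "HA": 0.3}

OFFSET_LINE = re.compile(
    r"Estimated Referencing Offset for (\S+):\s*"
    r"(-?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)\s*ppm"
)


def parse_offset_line(line):
    """Return (nucleus, offset_ppm, fit_error_ppm), or None if no match."""
    match = OFFSET_LINE.search(line)
    if match is None:
        return None
    return match.group(1), float(match.group(2)), float(match.group(3))


def correction_likely_needed(nucleus, offset_ppm, fit_error_ppm):
    """Apply rule (1): compare the offset with both thresholds."""
    reaches_database_uncertainty = (
        abs(offset_ppm) >= DATABASE_UNCERTAINTY_PPM[nucleus]
    )
    exceeds_five_fit_errors = abs(offset_ppm) > 5.0 * fit_error_ppm
    return reaches_database_uncertainty or exceeds_five_fit_errors


sample = "Estimated Referencing Offset for CA/CB: 0.795 +/- 0.104 ppm (Size: 66)"
nucleus, offset, fit_error = parse_offset_line(sample)
# 0.795 ppm is more than 5 x 0.104 = 0.52 ppm, so a correction is suggested.
print(nucleus, correction_likely_needed(nucleus, offset, fit_error))
```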
TALOS-N provides an option "-offset" to automatically apply a chemical
shift offset correction if needed:
talosn -in myshifts.tab -offset
and an option "-iso" to apply a ²H isotope
correction to ¹³Cα/¹³Cβ chemical shifts
collected from a perdeuterated protein sample:
talosn -in myshifts.tab -iso
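When these runs are scripted, the command lines above can be assembled
programmatically. The following Python sketch only builds the argument
list; the helper name build_talosn_command is hypothetical, and because
the text does not say whether "-check", "-offset" and "-iso" may be
combined, the sketch accepts at most one of them per call.

```python
# Options named in this section; nothing beyond these is assumed.
KNOWN_OPTIONS = ("-check", "-offset", "-iso")


def build_talosn_command(shift_table, option=None):
    """Return the argument list for one talosn invocation.

    Hypothetical helper: it mirrors the command lines shown in the text
    and deliberately allows only a single optional flag.
    """
    if option is not None and option not in KNOWN_OPTIONS:
        raise ValueError(f"unsupported option: {option}")
    command = ["talosn", "-in", shift_table]
    if option is not None:
        command.append(option)
    return command


print(" ".join(build_talosn_command("myshifts.tab", "-check")))
```

The resulting list could be passed to subprocess.run(command, check=True)
from the prediction directory, provided talosn is on the PATH.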
Exclusion of proteins from the database
One or more proteins can be excluded from the database during
the TALOS-N database search with a command line such as:
talosn -in myshifts.tab -excl name1 name2 ...
where "name1" and "name2"
etc. are the names of the proteins to be excluded (see the valid protein
names in the database "talos.tab").
Amino acid sequence based protein secondary structure prediction
By default, the amino acid sequence based protein secondary structure
prediction module is integrated into TALOS-N as a complement to the chemical
shift based module and can bridge stretches in proteins that lack chemical
shifts. This module can also be run separately with a command line such as:
talosn_ss my_sequence.fasta
where "my_sequence.fasta" is the sequence input file in standard
FASTA format. Please check the "talosn_ss" script for all requirements
for running this module.
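If the sequence is only available as a plain string, a FASTA file can be
written with a short Python helper such as the sketch below. It follows the
general FASTA convention (a ">" header line followed by wrapped sequence
lines); whether "talosn_ss" places further constraints on the header or
line length is not stated here, so the name format_fasta, the header text
and the 60-column width are assumptions.

```python
def format_fasta(name, sequence, width=60):
    """Return a FASTA record: a '>' header, then the wrapped sequence."""
    # Remove whitespace and use upper-case one-letter residue codes.
    cleaned = "".join(sequence.split()).upper()
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([f">{name}"] + lines) + "\n"


record = format_fasta("my_protein", "mqifv ktltg", width=5)
print(record, end="")

# To create the input file used in the command above:
# with open("my_sequence.fasta", "w") as handle:
#     handle.write(format_fasta("my_protein", sequence))
```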