Modeling of loops in protein structures

Andras Fiser, R. Kinh Gian Do, A. Sali

Research output: Contribution to journal › Article

1327 Citations (Scopus)

Abstract

Comparative protein structure prediction is limited mostly by the errors in alignment and loop modeling. We describe here a new automated modeling technique that significantly improves the accuracy of loop predictions in protein structures. The positions of all nonhydrogen atoms of the loop are optimized in a fixed environment with respect to a pseudo energy function. The energy is a sum of many spatial restraints that include the bond length, bond angle, and improper dihedral angle terms from the CHARMM-22 force field, statistical preferences for the main-chain and side-chain dihedral angles, and statistical preferences for nonbonded atomic contacts that depend on the two atom types, their distance through space, and separation in sequence. The energy function is optimized with the method of conjugate gradients combined with molecular dynamics and simulated annealing. Typically, the predicted loop conformation corresponds to the lowest energy conformation among 500 independent optimizations. Predictions were made for 40 loops of known structure at each length from 1 to 14 residues. The accuracy of loop predictions is evaluated as a function of thoroughness of conformational sampling, loop length, and structural properties of native loops. When accuracy is measured by local superposition of the model on the native loop, 100, 90, and 30% of 4-, 8-, and 12-residue loop predictions, respectively, had <2 Å RMSD error for the main-chain N, C(α), C, and O atoms; the average accuracies were 0.59 ± 0.05, 1.16 ± 0.10, and 2.61 ± 0.16 Å, respectively. To simulate real comparative modeling problems, the method was also evaluated by predicting loops of known structure in only approximately correct environments with errors typical of comparative modeling without misalignment. When the RMSD distortion of the main-chain stem atoms is 2.5 Å, the average loop prediction error increased by 180, 25, and 3% for 4-, 8-, and 12-residue loops, respectively.
The accuracy of the lowest energy prediction for a given loop can be estimated from the structural variability among a number of low energy predictions. The relative value of the present method is gauged by (1) comparing it with one of the most successful previously described methods, and (2) describing its accuracy in recent blind predictions of protein structure. Finally, it is shown that the average accuracy of prediction is limited primarily by the accuracy of the energy function rather than by the extent of conformational sampling.
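
The abstract reports accuracy as the RMSD of the main-chain N, C(α), C, and O atoms after local superposition of the model loop on the native loop. The following is a minimal sketch of that kind of measurement, not the authors' implementation. It assumes the Kabsch algorithm for the superposition and NumPy for the linear algebra (neither is named in the abstract); the function name `superposed_rmsd` and the array layout are hypothetical.

```python
import numpy as np


def superposed_rmsd(model, native):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    Assumption: rows are matched atoms (e.g. main-chain N, CA, C, O of each
    loop residue, in the same order in both arrays). Uses the Kabsch algorithm.
    """
    model = np.asarray(model, dtype=float)
    native = np.asarray(native, dtype=float)
    if model.shape != native.shape or model.ndim != 2 or model.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (N, 3)")

    # Remove the translational component by centering both sets.
    p = model - model.mean(axis=0)
    q = native - native.mean(axis=0)

    # SVD of the 3x3 covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0.0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate the centered model onto the centered native and measure the error.
    diff = p @ rot.T - q
    return float(np.sqrt((diff ** 2).sum() / len(p)))


if __name__ == "__main__":
    # Hypothetical data: 4 residues x 4 main-chain atoms = 16 points.
    rng = np.random.default_rng(0)
    native_loop = rng.normal(size=(16, 3))
    model_loop = native_loop + rng.normal(scale=0.3, size=(16, 3))
    print(f"main-chain RMSD after local superposition: "
          f"{superposed_rmsd(model_loop, native_loop):.2f}")
```

Because the superposition here uses only the loop atoms themselves, the value reflects the internal loop conformation and not the loop's orientation relative to the rest of the protein.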

Original language: English (US)
Pages (from-to): 1753-1773
Number of pages: 21
Journal: Protein Science
Volume: 9
Issue number: 9
State: Published - 2000
Externally published: Yes

Fingerprint

Proteins
Molecular Dynamics Simulation
Atoms
Dihedral angle
Conformations
Sampling
Chemical bonds
Bond length
Simulated annealing
Molecular dynamics
Structural properties

Keywords

  • Comparative or homology protein structure modeling
  • Loop modeling

ASJC Scopus subject areas

  • Biochemistry

Cite this

Fiser, A., Kinh Gian Do, R., & Sali, A. (2000). Modeling of loops in protein structures. Protein Science, 9(9), 1753-1773.

@article{59757a9b7c524702a9b06b97ff5b78aa,
title = "Modeling of loops in protein structures",
keywords = "Comparative or homology protein structure modeling, Loop modeling",
author = "Andras Fiser and {Kinh Gian Do}, R. and A. Sali",
year = "2000",
language = "English (US)",
volume = "9",
pages = "1753--1773",
journal = "Protein Science",
issn = "0961-8368",
publisher = "Cold Spring Harbor Laboratory Press",
number = "9",

}

TY - JOUR

T1 - Modeling of loops in protein structures

AU - Fiser, Andras

AU - Kinh Gian Do, R.

AU - Sali, A.

PY - 2000

Y1 - 2000

KW - Comparative or homology protein structure modeling

KW - Loop modeling

UR - http://www.scopus.com/inward/record.url?scp=0033810049&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0033810049&partnerID=8YFLogxK

M3 - Article

C2 - 11045621

AN - SCOPUS:0033810049

VL - 9

SP - 1753

EP - 1773

JO - Protein Science

JF - Protein Science

SN - 0961-8368

IS - 9

ER -