MODBASE, a database of annotated comparative protein structure models, and associated resources

Ursula Pieper, Narayanan Eswar, Hannes Braberg, M. S. Madhusudhan, Fred P. Davis, Ashley C. Stuart, Nebojsa Mirkovic, Andrea Rossi, Marc A. Marti-Renom, Andras Fiser, Ben Webb, Daniel Greenblatt, Conrad C. Huang, Thomas E. Ferrin, Andrej Sali

Research output: Contribution to journal (Article)

186 Citations (Scopus)

Abstract

MODBASE (http://salilab.org/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on the MODELLER package for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE uses the MySQL relational database management system for flexible querying and CHIMERA for viewing the sequences and structures (http://www.cgl.ucsf.edu/chimera/). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different data sets. The largest data set contains 1 262 629 models for domains in 659 495 out of 1 182 126 unique protein sequences in the complete Swiss-Prot/TrEMBL database (August 25, 2003); only models based on alignments with significant similarity scores and models assessed to have the correct fold despite insignificant alignments are included. Another model data set supports target selection and structure-based annotation by the New York Structural Genomics Research Consortium; e.g. the 53 new structures produced by the consortium allowed us to structurally characterize 24 113 sequences. MODBASE also contains binding site predictions for small ligands and a set of predicted interactions between pairs of modeled sequences from the same genome.
Our other resources associated with MODBASE include a comprehensive database of multiple protein structure alignments (DBALI, http://salilab.org/dbali) as well as web servers for automated comparative modeling with MODPIPE (MODWEB, http://salilab.org/modweb), modeling of loops in protein structures (MODLOOP, http://salilab.org/modloop) and predicting functional consequences of single nucleotide polymorphisms (SNPWEB, http://salilab.org/snpweb).
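The counts quoted in the abstract can be turned into a few summary ratios. The following is a minimal sketch that uses only the figures stated above; the variable names are illustrative and are not part of MODBASE or MODPIPE. It shows that roughly 56% of the unique sequences have at least one model, that there are just under two models per modeled sequence, and that each new consortium structure corresponds to about 455 characterized sequences on average.

```python
# Figures quoted in the abstract (Swiss-Prot/TrEMBL snapshot of August 25, 2003).
# The names below are hypothetical labels chosen for this illustration.
MODELS = 1_262_629             # models in the largest data set
MODELED_SEQUENCES = 659_495    # unique sequences with at least one model
ALL_SEQUENCES = 1_182_126      # unique sequences in Swiss-Prot/TrEMBL
NEW_STRUCTURES = 53            # new structures produced by the consortium
CHARACTERIZED = 24_113         # sequences structurally characterized as a result

# Fraction of all unique sequences that have at least one model.
coverage = MODELED_SEQUENCES / ALL_SEQUENCES

# Average number of domain models per modeled sequence.
models_per_sequence = MODELS / MODELED_SEQUENCES

# Average number of sequences characterized per new consortium structure.
sequences_per_structure = CHARACTERIZED / NEW_STRUCTURES

print(f"Sequence coverage:        {coverage:.1%}")
print(f"Models per sequence:      {models_per_sequence:.2f}")
print(f"Sequences per structure:  {sequences_per_structure:.0f}")
```

These are plain averages over the reported totals; the abstract does not give per-sequence or per-structure distributions, so nothing beyond the means can be derived from it.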

Original language: English (US)
Journal: Nucleic Acids Research
Volume: 32
Issue number: DATABASE ISS.
State: Published - Jan 1 2004

Fingerprint

Protein Databases
Databases
Proteins
Database Management Systems
Sequence Alignment
Genomics
Single Nucleotide Polymorphism
Software
Binding Sites
Genome
Ligands
Growth
Research
Datasets

ASJC Scopus subject areas

  • Genetics

Cite this

Pieper, U., Eswar, N., Braberg, H., Madhusudhan, M. S., Davis, F. P., Stuart, A. C., ... Sali, A. (2004). MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Research, 32(DATABASE ISS.).

MODBASE, a database of annotated comparative protein structure models, and associated resources. / Pieper, Ursula; Eswar, Narayanan; Braberg, Hannes; Madhusudhan, M. S.; Davis, Fred P.; Stuart, Ashley C.; Mirkovic, Nebojsa; Rossi, Andrea; Marti-Renom, Marc A.; Fiser, Andras; Webb, Ben; Greenblatt, Daniel; Huang, Conrad C.; Ferrin, Thomas E.; Sali, Andrej.

In: Nucleic Acids Research, Vol. 32, No. DATABASE ISS., 01.01.2004.

Pieper, U, Eswar, N, Braberg, H, Madhusudhan, MS, Davis, FP, Stuart, AC, Mirkovic, N, Rossi, A, Marti-Renom, MA, Fiser, A, Webb, B, Greenblatt, D, Huang, CC, Ferrin, TE & Sali, A 2004, 'MODBASE, a database of annotated comparative protein structure models, and associated resources', Nucleic Acids Research, vol. 32, no. DATABASE ISS..
Pieper U, Eswar N, Braberg H, Madhusudhan MS, Davis FP, Stuart AC et al. MODBASE, a database of annotated comparative protein structure models, and associated resources. Nucleic Acids Research. 2004 Jan 1;32(DATABASE ISS.).
Pieper, Ursula ; Eswar, Narayanan ; Braberg, Hannes ; Madhusudhan, M. S. ; Davis, Fred P. ; Stuart, Ashley C. ; Mirkovic, Nebojsa ; Rossi, Andrea ; Marti-Renom, Marc A. ; Fiser, Andras ; Webb, Ben ; Greenblatt, Daniel ; Huang, Conrad C. ; Ferrin, Thomas E. ; Sali, Andrej. / MODBASE, a database of annotated comparative protein structure models, and associated resources. In: Nucleic Acids Research. 2004 ; Vol. 32, No. DATABASE ISS.
@article{f75f25b73ca048bbab30d971b22d1354,
title = "MODBASE, a database of annotated comparative protein structure models, and associated resources",
author = "Ursula Pieper and Narayanan Eswar and Hannes Braberg and Madhusudhan, {M. S.} and Davis, {Fred P.} and Stuart, {Ashley C.} and Nebojsa Mirkovic and Andrea Rossi and Marti-Renom, {Marc A.} and Andras Fiser and Ben Webb and Daniel Greenblatt and Huang, {Conrad C.} and Ferrin, {Thomas E.} and Andrej Sali",
year = "2004",
month = "1",
day = "1",
language = "English (US)",
volume = "32",
journal = "Nucleic Acids Research",
issn = "0305-1048",
publisher = "Oxford University Press",
number = "DATABASE ISS.",

}

TY - JOUR

T1 - MODBASE, a database of annotated comparative protein structure models, and associated resources

AU - Pieper, Ursula

AU - Eswar, Narayanan

AU - Braberg, Hannes

AU - Madhusudhan, M. S.

AU - Davis, Fred P.

AU - Stuart, Ashley C.

AU - Mirkovic, Nebojsa

AU - Rossi, Andrea

AU - Marti-Renom, Marc A.

AU - Fiser, Andras

AU - Webb, Ben

AU - Greenblatt, Daniel

AU - Huang, Conrad C.

AU - Ferrin, Thomas E.

AU - Sali, Andrej

PY - 2004/1/1

Y1 - 2004/1/1

AB - MODBASE (http://salilab.org/modbase) is a relational database of annotated comparative protein structure models for all available protein sequences matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on the MODELLER package for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE uses the MySQL relational database management system for flexible querying and CHIMERA for viewing the sequences and structures (http://www.cgl.ucsf.edu/chimera/). MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, as well as improvements in the software for calculating the models. For ease of access, MODBASE is organized into different data sets. The largest data set contains 1 262 629 models for domains in 659 495 out of 1 182 126 unique protein sequences in the complete Swiss-Prot/TrEMBL database (August 25, 2003); only models based on alignments with significant similarity scores and models assessed to have the correct fold despite insignificant alignments are included. Another model data set supports target selection and structure-based annotation by the New York Structural Genomics Research Consortium; e.g. the 53 new structures produced by the consortium allowed us to structurally characterize 24 113 sequences. MODBASE also contains binding site predictions for small ligands and a set of predicted interactions between pairs of modeled sequences from the same genome.
Our other resources associated with MODBASE include a comprehensive database of multiple protein structure alignments (DBALI, http://salilab.org/dbali) as well as web servers for automated comparative modeling with MODPIPE (MODWEB, http://salilab.org/modweb), modeling of loops in protein structures (MODLOOP, http://salilab.org/modloop) and predicting functional consequences of single nucleotide polymorphisms (SNPWEB, http://salilab.org/snpweb).

UR - http://www.scopus.com/inward/record.url?scp=9144244232&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=9144244232&partnerID=8YFLogxK

M3 - Article

C2 - 14681398

AN - SCOPUS:9144244232

VL - 32

JO - Nucleic Acids Research

JF - Nucleic Acids Research

SN - 0305-1048

IS - DATABASE ISS.

ER -