The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while PSI-2 (2005 to present) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e., structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today's UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al., Nat Biotechnol 25:849-851, 2007) has resulted from systematic targeting of large families. PSI's per-structure contribution to novel leverage was over 4-fold higher than that of non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may take only another ∼15 years to cover most sequences in the current UniProt database.
Keywords
- Protein structure determination
- Protein universe
- Structural genomics
ASJC Scopus subject areas
- Structural Biology