Gene prediction using the Self-Organizing Map

Automatic generation of multiple gene models

Shaun Mahony, James O. McInerney, Terry J. Smith, Aaron Golden

Research output: Contribution to journal › Article

18 Citations (Scopus)

Abstract

Background: Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation.

Results: This work explores a new approach to gene-prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential.

Conclusions: While its raw accuracy rate can be lower than that of other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.
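The abstract names relative synonymous codon usage (RSCU) as the coding indicator but does not spell out how it is computed. As a rough illustration only, the sketch below uses the commonly cited definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the use of the standard genetic code, and the choice to skip amino acids that do not occur are assumptions made here, not details taken from RescueNet.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
# Using this table is an assumption; other genetic codes exist.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)


def rscu(seq):
    """Return {codon: RSCU} for an in-frame DNA sequence (hypothetical helper).

    RSCU = observed codon count * number of synonymous codons
           / total count of all codons for that amino acid.
    Amino acids that never occur in the sequence are skipped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons by the amino acid they encode.
    synonyms = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            synonyms[aa].append(codon)

    values = {}
    for aa, codons in synonyms.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Example: leucine appears four times (CTG x3, CTT x1), glutamate twice.
example = rscu("CTGCTGCTGCTTGAAGAG")
print(example["CTG"], example["CTT"], example["GAA"])
```

A value of 1.0 means a codon is used exactly as often as expected under uniform synonymous usage; values above 1.0 indicate a preferred codon. A vector of such values per sequence is the kind of input a Self-Organizing Map could cluster, although the exact encoding RescueNet uses is not described here.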

Original language: English (US)
Article number: 23
Journal: BMC Bioinformatics
Volume: 5
DOI: 10.1186/1471-2105-5-23
State: Published - Mar 5 2004
Externally published: Yes

ASJC Scopus subject areas

  • Medicine (all)
  • Structural Biology
  • Applied Mathematics

Cite this

Gene prediction using the Self-Organizing Map: Automatic generation of multiple gene models. / Mahony, Shaun; McInerney, James O.; Smith, Terry J.; Golden, Aaron. In: BMC Bioinformatics, Vol. 5, 23, 05.03.2004.

@article{e7225feccf3d4187bebb43864d701fdb,
title = "Gene prediction using the Self-Organizing Map: Automatic generation of multiple gene models",
abstract = "Background: Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation. Results: This work explores a new approach to gene-prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. Conclusions: While its raw accuracy rate can be lower than that of other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.",
author = "Shaun Mahony and McInerney, {James O.} and Smith, {Terry J.} and Aaron Golden",
year = "2004",
month = "3",
day = "5",
doi = "10.1186/1471-2105-5-23",
language = "English (US)",
volume = "5",
journal = "BMC Bioinformatics",
issn = "1471-2105",
publisher = "BioMed Central",
}

TY  - JOUR
T1  - Gene prediction using the Self-Organizing Map
T2  - Automatic generation of multiple gene models
AU  - Mahony, Shaun
AU  - McInerney, James O.
AU  - Smith, Terry J.
AU  - Golden, Aaron
PY  - 2004/3/5
Y1  - 2004/3/5
N2  - Background: Many current gene prediction methods use only one model to represent protein-coding regions in a genome, and so are less likely to predict the location of genes that have an atypical sequence composition. It is likely that future improvements in gene finding will involve the development of methods that can adequately deal with intra-genomic compositional variation. Results: This work explores a new approach to gene-prediction, based on the Self-Organizing Map, which has the ability to automatically identify multiple gene models within a genome. The current implementation, named RescueNet, uses relative synonymous codon usage as the indicator of protein-coding potential. Conclusions: While its raw accuracy rate can be lower than that of other methods, RescueNet consistently identifies some genes that other methods do not, and should therefore be of interest to gene-prediction software developers and genome annotation teams alike. RescueNet is recommended for use in conjunction with, or as a complement to, other gene prediction methods.
UR  - http://www.scopus.com/inward/record.url?scp=2942542784&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=2942542784&partnerID=8YFLogxK
U2  - 10.1186/1471-2105-5-23
DO  - 10.1186/1471-2105-5-23
M3  - Article
VL  - 5
JO  - BMC Bioinformatics
JF  - BMC Bioinformatics
SN  - 1471-2105
M1  - 23
ER  -