Self-organizing maps of position weight matrices for motif discovery in biological sequences

Shaun Mahony; David Hendrix; Terry J. Smith; Aaron Golden

doi:10.1007/s10462-005-9011-9

Research output: Contribution to journal › Article › peer-review

11 Scopus citations

Abstract

The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.
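The idea in the abstract, a SOM whose nodes hold position weight matrices (PWMs) instead of plain weight vectors, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the log-likelihood matching rule, the Gaussian neighbourhood, the linear decay schedule, and the probability floor are all hypothetical choices made for the example.

```python
import math
import random

ALPHABET = "ACGT"
FLOOR = 1e-6  # assumed probability floor so that log() is always defined


def random_pwm(k, rng):
    """Return a random PWM: k columns, each a distribution over ALPHABET."""
    pwm = []
    for _ in range(k):
        weights = [rng.random() + 0.5 for _ in ALPHABET]
        total = sum(weights)
        pwm.append({base: w / total for base, w in zip(ALPHABET, weights)})
    return pwm


def log_likelihood(pwm, kmer):
    """Log-probability of a k-mer under a PWM, positions treated as independent."""
    return sum(math.log(column[base]) for column, base in zip(pwm, kmer))


def best_node(grid, kmer):
    """Grid coordinate of the node whose PWM scores the k-mer highest."""
    return max(grid, key=lambda node: log_likelihood(grid[node], kmer))


def train_som(kmers, rows=2, cols=2, epochs=20, lr=0.5, sigma=1.0, seed=0):
    """Train a rows x cols map of PWMs on equal-length k-mers (hypothetical scheme)."""
    rng = random.Random(seed)
    k = len(kmers[0])
    grid = {(r, c): random_pwm(k, rng) for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay, an assumption
        rate = lr * decay
        width = max(sigma * decay, 0.1)       # neighbourhood shrinks over time
        order = list(kmers)
        rng.shuffle(order)
        for kmer in order:
            win = best_node(grid, kmer)
            for node, pwm in grid.items():
                dist2 = (node[0] - win[0]) ** 2 + (node[1] - win[1]) ** 2
                h = rate * math.exp(-dist2 / (2.0 * width ** 2))
                # Pull each column toward the one-hot encoding of the k-mer.
                for column, base in zip(pwm, kmer):
                    for b in ALPHABET:
                        target = 1.0 if b == base else 0.0
                        column[b] += h * (target - column[b])
                        column[b] = max(column[b], FLOOR)
                    total = sum(column.values())
                    for b in ALPHABET:
                        column[b] /= total    # keep the column a distribution
    return grid


# Toy input: two artificial "motifs", ten copies each.
kmers = ["ACGTAC"] * 10 + ["TTTTTT"] * 10
som = train_som(kmers)
winner = best_node(som, "ACGTAC")
```

After training, each node's PWM summarises the k-mers it attracted, so inspecting the nodes (or counting how many k-mers map to each) gives a rough picture of the features present in the input set.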

Original language	English (US)
Pages (from-to)	397-413
Number of pages	17
Journal	Artificial Intelligence Review
Volume	24
Issue number	3-4
DOIs	https://doi.org/10.1007/s10462-005-9011-9
State	Published - Nov 2005
Externally published	Yes

Keywords

Biological motif discovery
Self-organizing map

ASJC Scopus subject areas

Language and Linguistics
Linguistics and Language
Artificial Intelligence

Cite this

@article{26a636aa814d49cc874f6a86843f1975,
  title = "Self-organizing maps of position weight matrices for motif discovery in biological sequences",
  abstract = "The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.",
  keywords = "Biological motif discovery, Self-organizing map",
  author = "Shaun Mahony and David Hendrix and Smith, {Terry J.} and Aaron Golden",
  note = "Funding Information: S.M. is funded by the Irish Research Council for Science, Engineering and Technology. The authors also wish to thank the NUI, Galway – University of California EAP program for facilitating this work. Copyright: Copyright 2008 Elsevier B.V., All rights reserved.",
  year = "2005",
  month = nov,
  doi = "10.1007/s10462-005-9011-9",
  language = "English (US)",
  volume = "24",
  pages = "397--413",
  journal = "Artificial Intelligence Review",
  issn = "0269-2821",
  publisher = "Springer Netherlands",
  number = "3-4",
}

TY  - JOUR
T1  - Self-organizing maps of position weight matrices for motif discovery in biological sequences
AU  - Mahony, Shaun
AU  - Hendrix, David
AU  - Smith, Terry J.
AU  - Golden, Aaron
N1  - Funding Information: S.M. is funded by the Irish Research Council for Science, Engineering and Technology. The authors also wish to thank the NUI, Galway – University of California EAP program for facilitating this work. Copyright: Copyright 2008 Elsevier B.V., All rights reserved.
PY  - 2005/11
Y1  - 2005/11
N2  - The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.
AB  - The identification of overrepresented motifs in a collection of biological sequences continues to be a relevant and challenging problem in computational biology. Currently popular methods of motif discovery are based on statistical learning theory. In this paper, a machine-learning approach to the motif discovery problem is explored. The approach is based on a Self-Organizing Map (SOM) where the output layer neuron weight vectors are replaced by position weight matrices. This approach can be used to characterise features present in a set of sequences, and thus can be used as an aid in overrepresented motif discovery. The SOM approach to motif discovery is demonstrated using biological sequence datasets, both real and simulated.
KW  - Biological motif discovery
KW  - Self-organizing map
UR  - http://www.scopus.com/inward/record.url?scp=29144501231&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=29144501231&partnerID=8YFLogxK
U2  - 10.1007/s10462-005-9011-9
DO  - 10.1007/s10462-005-9011-9
M3  - Article
AN  - SCOPUS:29144501231
SN  - 0269-2821
VL  - 24
SP  - 397
EP  - 413
JO  - Artificial Intelligence Review
JF  - Artificial Intelligence Review
IS  - 3-4
ER  - 
