Improved detection of DNA motifs using a self-organized clustering of familial binding profiles

Shaun Mahony, Aaron Golden, Terry J. Smith, Panayiotis V. Benos

Research output: Contribution to journal (Article)

32 Citations (Scopus)

Abstract

Motivation: One of the limiting factors in deciphering transcriptional regulatory networks is the effectiveness of motif-finding software. An emerging avenue for improving motif-finding accuracy aims to incorporate generalized binding constraints of related transcription factors (TFs), named familial binding profiles (FBPs), as priors in motif identification methods. A motif-finder can thus be 'biased' towards finding motifs from a particular TF family. However, current motif-finders allow only a single FBP to be used as a prior in a given motif-finding run. In addition, current FBP construction methods are based on manual clustering of position specific scoring matrices (PSSMs) according to the known structural properties of the TF proteins. Manual clustering assumes that the binding preferences of structurally similar TFs will also be similar. This assumption is not true, at least not for some TF families. Automatic PSSM clustering methods are thus required for augmenting the usefulness of FBPs.

Results: A novel method is developed for automatic clustering of PSSM models. The resulting FBPs are incorporated into the SOMBRERO motif-finder, significantly improving its performance when finding motifs related to those that have been incorporated. SOMBRERO is thus the only existing de novo motif-finder that can incorporate knowledge of all known PSSMs in a given motif-finding run.
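
As a rough illustration of what automatic PSSM clustering into FBPs can look like, the following minimal Python sketch groups toy matrices by similarity and averages each group into a single profile. It is not the method of the paper, which uses a self-organized clustering. The sketch swaps in a simple greedy threshold clustering with a Euclidean column distance, and it assumes the PSSMs are already aligned and of equal width. All function names, the toy matrices, and the threshold value are hypothetical.

```python
from math import sqrt

# A PSSM is represented as a list of columns; each column holds the
# probabilities of A, C, G, T at that position (assumed representation).

def column_distance(p, q):
    """Euclidean distance between two base-probability columns."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def pssm_distance(m1, m2):
    """Mean column distance between two equal-width, pre-aligned PSSMs."""
    return sum(column_distance(p, q) for p, q in zip(m1, m2)) / len(m1)

def average_profile(members):
    """Average the member PSSMs column by column into one profile."""
    width = len(members[0])
    return [
        [sum(m[i][b] for m in members) / len(members) for b in range(4)]
        for i in range(width)
    ]

def cluster_pssms(pssms, threshold):
    """Greedy clustering (a stand-in, not the paper's algorithm).

    Each PSSM joins the nearest existing profile if it lies within
    `threshold`; otherwise it starts a new cluster.
    """
    clusters = []
    for name, matrix in pssms.items():
        best, best_d = None, threshold
        for cluster in clusters:
            d = pssm_distance(matrix, cluster["profile"])
            if d <= best_d:
                best, best_d = cluster, d
        if best is None:
            clusters.append({"names": [name], "members": [matrix], "profile": matrix})
        else:
            best["names"].append(name)
            best["members"].append(matrix)
            best["profile"] = average_profile(best["members"])
    return clusters

# Toy two-column PSSMs (made-up values, not real TF data).
toy_pssms = {
    "tf_a": [[0.90, 0.03, 0.04, 0.03], [0.05, 0.05, 0.85, 0.05]],
    "tf_b": [[0.80, 0.10, 0.05, 0.05], [0.10, 0.05, 0.80, 0.05]],
    "tf_c": [[0.05, 0.05, 0.05, 0.85], [0.05, 0.85, 0.05, 0.05]],
}

clusters = cluster_pssms(toy_pssms, threshold=0.3)
for cluster in clusters:
    print(cluster["names"], cluster["profile"])
```

In this sketch the profile of each cluster plays the role of an FBP. Because clusters are formed from matrix similarity alone, no structural classification of the TF proteins is needed, which is the property the abstract argues for.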

Original language: English (US)
Journal: Bioinformatics
Volume: 21
Issue number: SUPPL. 1
DOI: 10.1093/bioinformatics/bti1025
State: Published - Jun 2005
Externally published: Yes

Fingerprint

Position-Specific Scoring Matrices
Nucleotide Motifs
Transcription factors
Cluster Analysis
DNA
Transcription Factors
Transcription Factor
Clustering
Scoring
Gene Regulatory Networks
Structural properties
Regulatory Networks
Software
Matrix Models
Matrix Method
Profile
Clustering Methods
Structural Properties
Proteins
Biased

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Computer Science Applications
  • Computational Theory and Mathematics
  • Biochemistry
  • Molecular Biology
  • Computational Mathematics
  • Statistics and Probability

Cite this

Improved detection of DNA motifs using a self-organized clustering of familial binding profiles. / Mahony, Shaun; Golden, Aaron; Smith, Terry J.; Benos, Panayiotis V.

In: Bioinformatics, Vol. 21, No. SUPPL. 1, 06.2005.

Mahony, Shaun ; Golden, Aaron ; Smith, Terry J. ; Benos, Panayiotis V. / Improved detection of DNA motifs using a self-organized clustering of familial binding profiles. In: Bioinformatics. 2005 ; Vol. 21, No. SUPPL. 1.
@article{4bf232b5c2854df593410fdca6d3f67f,
title = "Improved detection of DNA motifs using a self-organized clustering of familial binding profiles",
author = "Shaun Mahony and Aaron Golden and Smith, {Terry J.} and Benos, {Panayiotis V.}",
year = "2005",
month = "6",
doi = "10.1093/bioinformatics/bti1025",
language = "English (US)",
volume = "21",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "SUPPL. 1",

}

TY - JOUR

T1 - Improved detection of DNA motifs using a self-organized clustering of familial binding profiles

AU - Mahony, Shaun

AU - Golden, Aaron

AU - Smith, Terry J.

AU - Benos, Panayiotis V.

PY - 2005/6

Y1 - 2005/6

UR - http://www.scopus.com/inward/record.url?scp=29144465554&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=29144465554&partnerID=8YFLogxK

U2 - 10.1093/bioinformatics/bti1025

DO - 10.1093/bioinformatics/bti1025

M3 - Article

C2 - 15961468

AN - SCOPUS:29144465554

VL - 21

JO - Bioinformatics

JF - Bioinformatics

SN - 1367-4803

IS - SUPPL. 1

ER -