Picking ChIP-seq peak detectors for analyzing chromatin modification experiments

Mariann Micsinai; Fabio Parisi; Francesco Strino; Patrik Asp; Brian D. Dynlacht; Yuval Kluger

doi:10.1093/nar/gks048

Surgery

Research output: Contribution to journal › Article › peer-review

57 Scopus citations

Abstract

Numerous algorithms have been developed to analyze ChIP-Seq data. However, the complexity of analyzing diverse patterns of ChIP-Seq signals, especially for epigenetic marks, still calls for the development of new algorithms and objective comparisons of existing methods. We developed Qeseq, an algorithm to detect regions of increased ChIP read density relative to background. Qeseq employs critical novel elements, such as iterative recalibration and neighbor joining of reads, to identify enriched regions of any length. To objectively assess its performance relative to 14 other ChIP-Seq peak finders, we designed a novel protocol based on Validation Discriminant Analysis (VDA) to optimally select validation sites and generated two validation datasets, which are the most comprehensive to date for algorithmic benchmarking of key epigenetic marks. In addition, we systematically explored a total of 315 diverse parameter configurations from these algorithms and found that parameters that are optimal in one dataset typically do not generalize to other datasets. Nevertheless, default parameters show the most stable performance, suggesting that they should be used. This study also provides a reproducible and generalizable methodology for unbiased comparative analysis of high-throughput sequencing tools that can facilitate future algorithmic development.

Original language	English (US)
Pages (from-to)	e70
Journal	Nucleic acids research
Volume	40
Issue number	9
DOIs	https://doi.org/10.1093/nar/gks048
State	Published - May 2012

ASJC Scopus subject areas

Genetics

@article{1274bfe46f5a42219459ad1febea5083,
  title = "Picking ChIP-seq peak detectors for analyzing chromatin modification experiments",
  abstract = "Numerous algorithms have been developed to analyze ChIP-Seq data. However, the complexity of analyzing diverse patterns of ChIP-Seq signals, especially for epigenetic marks, still calls for the development of new algorithms and objective comparisons of existing methods. We developed Qeseq, an algorithm to detect regions of increased ChIP read density relative to background. Qeseq employs critical novel elements, such as iterative recalibration and neighbor joining of reads to identify enriched regions of any length. To objectively assess its performance relative to other 14 ChIP-Seq peak finders, we designed a novel protocol based on Validation Discriminant Analysis (VDA) to optimally select validation sites and generated two validation datasets, which are the most comprehensive to date for algorithmic benchmarking of key epigenetic marks. In addition, we systematically explored a total of 315 diverse parameter configurations from these algorithms and found that typically optimal parameters in one dataset do not generalize to other datasets. Nevertheless, default parameters show the most stable performance, suggesting that they should be used. This study also provides a reproducible and generalizable methodology for unbiased comparative analysis of high-throughput sequencing tools that can facilitate future algorithmic development.",
  author = "Mariann Micsinai and Fabio Parisi and Francesco Strino and Patrik Asp and Dynlacht, {Brian D.} and Yuval Kluger",
  note = "Funding Information: National Institutes of Health [Research Grant from the National Cancer Institute (CA-16359 to Y.K.) and (GM067132, GM067132-07S1 to B.D.)]; National Science Foundation (IGERT0333389 to M.M.); and by the American-Italian Cancer Foundation (Post-Doctoral Research Fellowship to F.S.). Funding for open access charge: Discretionary funds.",
  year = "2012",
  month = may,
  doi = "10.1093/nar/gks048",
  language = "English (US)",
  volume = "40",
  pages = "e70",
  journal = "Nucleic acids research",
  issn = "0305-1048",
  publisher = "Oxford University Press",
  number = "9",
}

TY - JOUR
T1 - Picking ChIP-seq peak detectors for analyzing chromatin modification experiments
AU - Micsinai, Mariann
AU - Parisi, Fabio
AU - Strino, Francesco
AU - Asp, Patrik
AU - Dynlacht, Brian D.
AU - Kluger, Yuval

N1 - Funding Information: National Institutes of Health [Research Grant from the National Cancer Institute (CA-16359 to Y.K.) and (GM067132, GM067132-07S1 to B.D.)]; National Science Foundation (IGERT0333389 to M.M.); and by the American-Italian Cancer Foundation (Post-Doctoral Research Fellowship to F.S.). Funding for open access charge: Discretionary funds.

PY - 2012/5

Y1 - 2012/5

N2 - Numerous algorithms have been developed to analyze ChIP-Seq data. However, the complexity of analyzing diverse patterns of ChIP-Seq signals, especially for epigenetic marks, still calls for the development of new algorithms and objective comparisons of existing methods. We developed Qeseq, an algorithm to detect regions of increased ChIP read density relative to background. Qeseq employs critical novel elements, such as iterative recalibration and neighbor joining of reads to identify enriched regions of any length. To objectively assess its performance relative to other 14 ChIP-Seq peak finders, we designed a novel protocol based on Validation Discriminant Analysis (VDA) to optimally select validation sites and generated two validation datasets, which are the most comprehensive to date for algorithmic benchmarking of key epigenetic marks. In addition, we systematically explored a total of 315 diverse parameter configurations from these algorithms and found that typically optimal parameters in one dataset do not generalize to other datasets. Nevertheless, default parameters show the most stable performance, suggesting that they should be used. This study also provides a reproducible and generalizable methodology for unbiased comparative analysis of high-throughput sequencing tools that can facilitate future algorithmic development.

AB - Numerous algorithms have been developed to analyze ChIP-Seq data. However, the complexity of analyzing diverse patterns of ChIP-Seq signals, especially for epigenetic marks, still calls for the development of new algorithms and objective comparisons of existing methods. We developed Qeseq, an algorithm to detect regions of increased ChIP read density relative to background. Qeseq employs critical novel elements, such as iterative recalibration and neighbor joining of reads to identify enriched regions of any length. To objectively assess its performance relative to other 14 ChIP-Seq peak finders, we designed a novel protocol based on Validation Discriminant Analysis (VDA) to optimally select validation sites and generated two validation datasets, which are the most comprehensive to date for algorithmic benchmarking of key epigenetic marks. In addition, we systematically explored a total of 315 diverse parameter configurations from these algorithms and found that typically optimal parameters in one dataset do not generalize to other datasets. Nevertheless, default parameters show the most stable performance, suggesting that they should be used. This study also provides a reproducible and generalizable methodology for unbiased comparative analysis of high-throughput sequencing tools that can facilitate future algorithmic development.

UR - http://www.scopus.com/inward/record.url?scp=84861397707&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84861397707&partnerID=8YFLogxK
U2 - 10.1093/nar/gks048
DO - 10.1093/nar/gks048
M3 - Article
C2 - 22307239
AN - SCOPUS:84861397707
SN - 0305-1048
VL - 40
SP - e70
JO - Nucleic acids research
JF - Nucleic acids research
IS - 9
ER -
