Reducing system noise in copy number data using principal components of self-self hybridizations

Yoon Ha Lee, Michael Ronemus, Jude Kendall, B. Lakshmi, Anthony Leotta, Dan Levy, Diane Esposito, Vladimir Grubor, Qian K. Ye, Michael Wigler, Boris Yamroma

Research output: Contribution to journal (Article)

4 Citations (Scopus)

Abstract

Genomic copy number variation underlies genetic disorders such as autism, schizophrenia, and congenital heart disease. Copy number variations are commonly detected by array-based comparative genomic hybridization of sample to reference DNAs, but probe and operational variables combine to create correlated system noise that degrades detection of genetic events. To correct for this, we have explored hybridizations in which no genetic signal is expected, namely "self-self" hybridizations (SSH) comparing DNAs from the same genome. We show that SSH trap a variety of correlated system noise that is also present in sample-reference (test) data. Through singular value decomposition of SSH, we are able to determine the principal components (PCs) of this noise. The PCs themselves offer deep insights into the sources of noise and facilitate detection of artifacts. We present evidence that linear and piecewise linear correction of test data with the PCs does not introduce detectable spurious signal, yet improves signal-to-noise metrics, reduces false positives, and facilitates copy number determination.
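The linear correction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' published pipeline: the function names (`noise_components`, `correct`, `simulate`, `rms`), the per-array centering step, the choice of `k`, and the synthetic data are all assumptions made for illustration, and the piecewise linear variant is not covered. SSH data are assumed to be a probes-by-arrays matrix of log ratios.

```python
import numpy as np


def noise_components(ssh, k):
    """Top-k noise PCs (probes x k) from an SSH matrix (probes x arrays).

    Centering each array before the SVD is an assumption of this sketch.
    """
    centered = ssh - ssh.mean(axis=0, keepdims=True)
    # Left singular vectors are orthonormal patterns across probes.
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k]


def correct(test, pcs):
    """Subtract the least-squares fit of the PCs from one test array."""
    coef, *_ = np.linalg.lstsq(pcs, test, rcond=None)
    return test - pcs @ coef


def rms(x):
    """Root-mean-square of a vector."""
    return float(np.sqrt(np.mean(np.square(x))))


def simulate(rng, n_probes=500, n_ssh=20):
    """Synthetic data: two hidden noise patterns shared by SSH and test arrays."""
    x = np.linspace(0.0, 1.0, n_probes)
    patterns = np.column_stack([np.sin(2 * np.pi * x), np.cos(6 * np.pi * x)])
    # SSH arrays carry only system noise (no genetic signal).
    ssh = patterns @ rng.normal(size=(2, n_ssh))
    ssh = ssh + 0.05 * rng.normal(size=(n_probes, n_ssh))
    # The test array carries a short copy-number gain plus the same kind of noise.
    truth = np.zeros(n_probes)
    truth[200:220] = 0.5
    test = truth + patterns @ np.array([0.8, -0.5])
    test = test + 0.05 * rng.normal(size=n_probes)
    return ssh, test, truth


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ssh, test, truth = simulate(rng)
    pcs = noise_components(ssh, k=2)
    cleaned = correct(test, pcs)
    print("RMS error before correction:", rms(test - truth))
    print("RMS error after correction: ", rms(cleaned - truth))
```

Because the fit is a plain least-squares projection, any part of a true copy number event that happens to align with the PCs is also removed; in this toy setup the event is short relative to the noise patterns, so the loss is small.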

Original language: English (US)
Journal: Proceedings of the National Academy of Sciences of the United States of America
Volume: 109
Issue number: 3
DOIs: 10.1073/pnas.1106233109
State: Published - Jan 17 2012

Fingerprint

Noise
Inborn Genetic Diseases
Comparative Genomic Hybridization
DNA Probes
Autistic Disorder
Artifacts
Heart Diseases
Schizophrenia
Genome
DNA

Keywords

  • Comparative genomic hybridization
  • Copy number variation
  • Principal component analysis
  • Singular value decomposition

ASJC Scopus subject areas

  • General

Cite this

Reducing system noise in copy number data using principal components of self-self hybridizations. / Lee, Yoon Ha; Ronemus, Michael; Kendall, Jude; Lakshmi, B.; Leotta, Anthony; Levy, Dan; Esposito, Diane; Grubor, Vladimir; Ye, Qian K.; Wigler, Michael; Yamroma, Boris.

In: Proceedings of the National Academy of Sciences of the United States of America, Vol. 109, No. 3, 17.01.2012.

@article{0bd6acb0b9774ccb9b4e2320fba2451c,
title = "Reducing system noise in copy number data using principal components of self-self hybridizations",
abstract = "Genomic copy number variation underlies genetic disorders such as autism, schizophrenia, and congenital heart disease. Copy number variations are commonly detected by array based comparative genomic hybridization of sample to reference DNAs, but probe and operational variables combine to create correlated system noise that degrades detection of genetic events. To correct for this we have explored hybridizations in which no genetic signal is expected, namely {"}self-self{"} hybridizations (SSH) comparing DNAs from the same genome. We show that SSH trap a variety of correlated system noise present also in sample-reference (test) data. Through singular value decomposition of SSH, we are able to determine the principal components (PCs) of this noise. The PCs themselves offer deep insights into the sources of noise, and facilitate detection of artifacts. We present evidence that linear and piece-wise linear correction of test data with the PCs does not introduce detectable spurious signal, yet improves signal-to-noise metrics, reduces false positives, and facilitates copy number determination.",
keywords = "Comparative genomic hybridization, Copy number variation, Principal component analysis, Singular value decomposition",
author = "Lee, {Yoon Ha} and Michael Ronemus and Jude Kendall and B. Lakshmi and Anthony Leotta and Dan Levy and Diane Esposito and Vladimir Grubor and Ye, {Qian K.} and Michael Wigler and Boris Yamroma",
year = "2012",
month = "1",
day = "17",
doi = "10.1073/pnas.1106233109",
language = "English (US)",
volume = "109",
journal = "Proceedings of the National Academy of Sciences of the United States of America",
issn = "0027-8424",
number = "3"
}

TY - JOUR

T1 - Reducing system noise in copy number data using principal components of self-self hybridizations

AU - Lee, Yoon Ha

AU - Ronemus, Michael

AU - Kendall, Jude

AU - Lakshmi, B.

AU - Leotta, Anthony

AU - Levy, Dan

AU - Esposito, Diane

AU - Grubor, Vladimir

AU - Ye, Qian K.

AU - Wigler, Michael

AU - Yamroma, Boris

PY - 2012/1/17

Y1 - 2012/1/17

AB - Genomic copy number variation underlies genetic disorders such as autism, schizophrenia, and congenital heart disease. Copy number variations are commonly detected by array based comparative genomic hybridization of sample to reference DNAs, but probe and operational variables combine to create correlated system noise that degrades detection of genetic events. To correct for this we have explored hybridizations in which no genetic signal is expected, namely "self-self" hybridizations (SSH) comparing DNAs from the same genome. We show that SSH trap a variety of correlated system noise present also in sample-reference (test) data. Through singular value decomposition of SSH, we are able to determine the principal components (PCs) of this noise. The PCs themselves offer deep insights into the sources of noise, and facilitate detection of artifacts. We present evidence that linear and piece-wise linear correction of test data with the PCs does not introduce detectable spurious signal, yet improves signal-to-noise metrics, reduces false positives, and facilitates copy number determination.

KW - Comparative genomic hybridization

KW - Copy number variation

KW - Principal component analysis

KW - Singular value decomposition

UR - http://www.scopus.com/inward/record.url?scp=84863029822&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863029822&partnerID=8YFLogxK

U2 - 10.1073/pnas.1106233109

DO - 10.1073/pnas.1106233109

M3 - Article

VL - 109

JO - Proceedings of the National Academy of Sciences of the United States of America

JF - Proceedings of the National Academy of Sciences of the United States of America

SN - 0027-8424

IS - 3

ER -