Reducing system noise in copy number data using principal components of self-self hybridizations

Yoon Ha Lee, Michael Ronemus, Jude Kendall, B. Lakshmi, Anthony Leotta, Dan Levy, Diane Esposito, Vladimir Grubor, Kenny Ye, Michael Wigler, Boris Yamrom

Research output: Contribution to journal › Article › peer-review



Genomic copy number variation underlies genetic disorders such as autism, schizophrenia, and congenital heart disease. Copy number variations are commonly detected by array-based comparative genomic hybridization of sample to reference DNAs, but probe and operational variables combine to create correlated system noise that degrades detection of genetic events. To correct for this, we have explored hybridizations in which no genetic signal is expected, namely "self-self" hybridizations (SSH) comparing DNAs from the same genome. We show that SSH trap a variety of correlated system noise that is also present in sample-reference (test) data. Through singular value decomposition of SSH, we are able to determine the principal components (PCs) of this noise. The PCs themselves offer deep insights into the sources of noise and facilitate detection of artifacts. We present evidence that linear and piecewise linear correction of test data with the PCs does not introduce detectable spurious signal, yet improves signal-to-noise metrics, reduces false positives, and facilitates copy number determination.
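
The linear correction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `noise_components` and `linear_correct`, the probes-by-arrays matrix layout, the choice of two PCs, and all of the synthetic data are assumptions made for illustration, and only the plain linear correction is shown, not the piecewise linear variant.

```python
import numpy as np

def noise_components(ssh, k):
    """Top-k left singular vectors of a (probes x arrays) SSH log-ratio matrix."""
    # Remove each array's mean so the PCs describe probe-level patterns only.
    centered = ssh - ssh.mean(axis=0, keepdims=True)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k]  # orthonormal columns, one per retained PC

def linear_correct(test, pcs):
    """Subtract the least-squares fit of the PCs from one test profile."""
    # The columns of pcs are orthonormal, so the fit is a simple projection.
    return test - pcs @ (pcs.T @ test)

# Synthetic demonstration (every value below is made up for illustration).
rng = np.random.default_rng(0)
n_probes, n_ssh = 2000, 20
x = np.linspace(0.0, 1.0, n_probes)
patterns = np.column_stack([
    0.3 * np.sin(6 * np.pi * x),          # smooth, position-dependent artifact
    0.3 * rng.standard_normal(n_probes),  # fixed probe-specific bias
])

def hybridize(signal, loadings):
    """Simulated log ratios: signal + weighted system noise + white noise."""
    return signal + patterns @ loadings + 0.1 * rng.standard_normal(n_probes)

# SSH arrays carry no genetic signal, only system noise with random weights.
ssh = np.column_stack([
    hybridize(np.zeros(n_probes), rng.standard_normal(2)) for _ in range(n_ssh)
])

# One test array with a simulated gain on probes 500-599.
event = np.zeros(n_probes, dtype=bool)
event[500:600] = True
test = hybridize(np.where(event, 0.58, 0.0), np.array([1.0, -0.8]))

pcs = noise_components(ssh, k=2)
corrected = linear_correct(test, pcs)

outside = ~event
print("noise SD before:", test[outside].std())
print("noise SD after: ", corrected[outside].std())
print("event height after:", corrected[event].mean() - corrected[outside].mean())
```

In this toy setting the PCs are estimated only from the signal-free SSH matrix, so the projection removes the shared noise patterns from the test profile. A small part of any real event that happens to align with a PC is removed as well, which is one reason the number of retained PCs matters in practice.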

Original language: English (US)
Pages (from-to): E103-E110
Journal: Proceedings of the National Academy of Sciences of the United States of America
Issue number: 3
State: Published - Jan 17 2012


Keywords

  • Comparative genomic hybridization
  • Copy number variation
  • Principal component analysis
  • Singular value decomposition

ASJC Scopus subject areas

  • General

