Design and statistical analysis of pooled next generation sequencing for rare variants

Tao Wang; Chang Yun Lin; Yuanhao Zhang; Ruofeng Wen; Kenny Ye

doi:10.1155/2012/524724

Epidemiology & Population Health

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One of them is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not well considered in the experimental design and analysis, could lead to either inflated false positive rates or loss in statistical power. Another important issue is how to test association of a group of rare variants. To address the first issue, we proposed a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we proposed a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrated that our approach provides good control of the type I error rate and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled.
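The blocked pooling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract only says that case and control pools are sequenced together on the same functional units. The function name `blocked_pooling_design`, its parameters, the round-robin pool assignment, and the equal number of case and control pools per block are all assumptions made for illustration, with each "block" standing in for one functional unit such as a lane.

```python
import random


def blocked_pooling_design(case_ids, control_ids, n_blocks, pools_per_group, seed=0):
    """Hypothetical layout: assign samples to pools, then pools to blocks.

    Each block represents one sequencing functional unit and receives the
    same number of case pools and control pools, so error variation between
    units affects both groups alike. All names here are illustrative.
    """
    rng = random.Random(seed)
    cases = list(case_ids)
    controls = list(control_ids)
    rng.shuffle(cases)
    rng.shuffle(controls)

    n_pools = n_blocks * pools_per_group  # total pools per group

    def split(samples):
        # Deal samples round-robin so pool sizes differ by at most one.
        return [samples[i::n_pools] for i in range(n_pools)]

    case_pools = split(cases)
    control_pools = split(controls)

    design = []
    for b in range(n_blocks):
        start = b * pools_per_group
        stop = start + pools_per_group
        design.append({
            "block": b,
            "case_pools": case_pools[start:stop],
            "control_pools": control_pools[start:stop],
        })
    return design


# Example: 40 cases and 40 controls, 2 blocks, 2 pools per group per block.
design = blocked_pooling_design(range(40), range(100, 140), n_blocks=2, pools_per_group=2)
for block in design:
    print(block["block"], len(block["case_pools"]), len(block["control_pools"]))
```

Keeping cases and controls balanced within each block is the design choice the sketch is meant to show. How pools are sized and how many blocks to use are questions the paper addresses and this sketch does not.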

Original language	English (US)
Article number	524724
Journal	Journal of Probability and Statistics
DOIs	https://doi.org/10.1155/2012/524724
State	Published - 2012

ASJC Scopus subject areas

Statistics and Probability

Cite this

@article{b57fafba514f4d36bfbde7932abd0185,

title = "Design and statistical analysis of pooled next generation sequencing for rare variants",

abstract = "Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One of them is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not well considered in the experimental design and analysis, could lead to either inflated false positive rates or loss in statistical power. Another important issue is how to test association of a group of rare variants. To address the first issue, we proposed a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we proposed a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrated that our approach provides good control of the type I error rate and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled.",

author = "Tao Wang and Lin, {Chang Yun} and Yuanhao Zhang and Ruofeng Wen and Kenny Ye",

year = "2012",

doi = "10.1155/2012/524724",

language = "English (US)",

journal = "Journal of Probability and Statistics",

issn = "1687-952X",

publisher = "Hindawi Publishing Corporation",

}

TY - JOUR

T1 - Design and statistical analysis of pooled next generation sequencing for rare variants

AU - Wang, Tao

AU - Lin, Chang Yun

AU - Zhang, Yuanhao

AU - Wen, Ruofeng

AU - Ye, Kenny

PY - 2012

Y1 - 2012

N2 - Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One of them is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not well considered in the experimental design and analysis, could lead to either inflated false positive rates or loss in statistical power. Another important issue is how to test association of a group of rare variants. To address the first issue, we proposed a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we proposed a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrated that our approach provides good control of the type I error rate and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled.

AB - Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One of them is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not well considered in the experimental design and analysis, could lead to either inflated false positive rates or loss in statistical power. Another important issue is how to test association of a group of rare variants. To address the first issue, we proposed a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we proposed a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrated that our approach provides good control of the type I error rate and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled.

UR - http://www.scopus.com/inward/record.url?scp=84867012853&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867012853&partnerID=8YFLogxK

U2 - 10.1155/2012/524724

DO - 10.1155/2012/524724

M3 - Article

AN - SCOPUS:84867012853

SN - 1687-952X

JO - Journal of Probability and Statistics

JF - Journal of Probability and Statistics

M1 - 524724

ER -
