TY - JOUR
T1 - The SEQC2 epigenomics quality control (EpiQC) study
AU - Foox, Jonathan
AU - Nordlund, Jessica
AU - Lalancette, Claudia
AU - Gong, Ting
AU - Lacey, Michelle
AU - Lent, Samantha
AU - Langhorst, Bradley W.
AU - Ponnaluri, V. K. Chaithanya
AU - Williams, Louise
AU - Padmanabhan, Karthik Ramaswamy
AU - Cavalcante, Raymond
AU - Lundmark, Anders
AU - Butler, Daniel
AU - Mozsary, Christopher
AU - Gurvitch, Justin
AU - Greally, John M.
AU - Suzuki, Masako
AU - Menor, Mark
AU - Nasu, Masaki
AU - Alonso, Alicia
AU - Sheridan, Caroline
AU - Scherer, Andreas
AU - Bruinsma, Stephen
AU - Golda, Gosia
AU - Muszynska, Agata
AU - Łabaj, Paweł P.
AU - Campbell, Matthew A.
AU - Wos, Frank
AU - Raine, Amanda
AU - Liljedahl, Ulrika
AU - Axelsson, Tomas
AU - Wang, Charles
AU - Chen, Zhong
AU - Yang, Zhaowei
AU - Li, Jing
AU - Yang, Xiaopeng
AU - Wang, Hongwei
AU - Melnick, Ari
AU - Guo, Shang
AU - Blume, Alexander
AU - Franke, Vedran
AU - Ibanez de Caceres, Inmaculada
AU - Rodriguez-Antolin, Carlos
AU - Rosas, Rocio
AU - Davis, Justin Wade
AU - Ishii, Jennifer
AU - Megherbi, Dalila B.
AU - Xiao, Wenming
AU - Liao, Will
AU - Xu, Joshua
AU - Hong, Huixiao
AU - Ning, Baitang
AU - Tong, Weida
AU - Akalin, Altuna
AU - Wang, Yunliang
AU - Deng, Youping
AU - Mason, Christopher E.
N1 - Funding Information:
The authors wish to thank Justin Zook for contributions to study design and advice for bioinformatics analysis. J.N, A.L, U.L, T.A, and A.R are supported by grants from the Swedish Research Council (2017-00630 / 2019-01976). I.I.C, R.R, and C.R.A are supported by ISCIII, project number PI18/00050. This project received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No 824110-EASI-Genomics. T.G and Y.P.D are supported by NIH Grants 5P30GM114737, P20GM103466, U54 MD007584, and 2U54MD007601. The genomic work carried out at the Loma Linda University Center for Genomics was funded in part by the National Institutes of Health (NIH) grant S10OD019960 (CW). This project is partially supported by AHA grant 18IPA34170301 (CW). We would also like to thank the Epigenomics Core Facility at Weill Cornell Medicine, the Starr Cancer Consortium (I13-0052), WorldQuant, The Pershing Square Sohn Cancer Research Alliance, NASA (NNX14AH50G, NNX17AB26G), the NIH (R01MH117406, R01CA249054, R01AI151059, P01CA214274, U01DA053941), and the Leukemia and Lymphoma Society (LLS) (MCL7001-18, LLS 9238-16, LLS-MCL7001-18). Barbara Cheifet was the primary editor of this article and managed its editorial process and peer review in collaboration with the rest of the editorial team. The views presented in this article do not necessarily reflect those of the U.S. Food and Drug Administration. Any mention of commercial products is for clarification and is not intended as an endorsement.
Publisher Copyright:
© 2021, The Author(s).
PY - 2021/12
Y1 - 2021/12
N2 - Background: Cytosine modifications in DNA such as 5-methylcytosine (5mC) underlie a broad range of developmental processes, maintain cellular lineage specification, and can define or stratify types of cancer and other diseases. However, the wide variety of approaches available to interrogate these modifications has created a need for harmonized materials, methods, and rigorous benchmarking to improve genome-wide methylome sequencing applications in clinical and basic research. Here, we present a multi-platform assessment and cross-validated resource for epigenetics research from the FDA’s Epigenomics Quality Control Group. Results: Each sample is processed in multiple replicates by three whole-genome bisulfite sequencing (WGBS) protocols (TruSeq DNA methylation, Accel-NGS MethylSeq, and SPLAT), oxidative bisulfite sequencing (TrueMethyl), an enzymatic deamination method (EMSeq), targeted methylation sequencing (Illumina Methyl Capture EPIC), single-molecule long-read nanopore sequencing from Oxford Nanopore Technologies, and 850k Illumina methylation arrays. After rigorous quality assessment and comparison to Illumina EPIC methylation microarrays and testing on a range of algorithms (Bismark, BitMapperBS, and bwa-meth), we find overall high concordance between assays, but also differences in efficiency of read mapping, CpG capture, coverage, and platform performance, and variable performance across 26 microarray normalization algorithms. Conclusions: The data provided herein can guide the use of these DNA reference materials in epigenomics research, as well as provide best practices for experimental design in future studies. By leveraging seven human cell lines that are designated as publicly available reference materials, these data can be used as a baseline to advance epigenomics research.
AB - Background: Cytosine modifications in DNA such as 5-methylcytosine (5mC) underlie a broad range of developmental processes, maintain cellular lineage specification, and can define or stratify types of cancer and other diseases. However, the wide variety of approaches available to interrogate these modifications has created a need for harmonized materials, methods, and rigorous benchmarking to improve genome-wide methylome sequencing applications in clinical and basic research. Here, we present a multi-platform assessment and cross-validated resource for epigenetics research from the FDA’s Epigenomics Quality Control Group. Results: Each sample is processed in multiple replicates by three whole-genome bisulfite sequencing (WGBS) protocols (TruSeq DNA methylation, Accel-NGS MethylSeq, and SPLAT), oxidative bisulfite sequencing (TrueMethyl), an enzymatic deamination method (EMSeq), targeted methylation sequencing (Illumina Methyl Capture EPIC), single-molecule long-read nanopore sequencing from Oxford Nanopore Technologies, and 850k Illumina methylation arrays. After rigorous quality assessment and comparison to Illumina EPIC methylation microarrays and testing on a range of algorithms (Bismark, BitMapperBS, and bwa-meth), we find overall high concordance between assays, but also differences in efficiency of read mapping, CpG capture, coverage, and platform performance, and variable performance across 26 microarray normalization algorithms. Conclusions: The data provided herein can guide the use of these DNA reference materials in epigenomics research, as well as provide best practices for experimental design in future studies. By leveraging seven human cell lines that are designated as publicly available reference materials, these data can be used as a baseline to advance epigenomics research.
UR - http://www.scopus.com/inward/record.url?scp=85120776449&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85120776449&partnerID=8YFLogxK
U2 - 10.1186/s13059-021-02529-2
DO - 10.1186/s13059-021-02529-2
M3 - Article
C2 - 34872606
AN - SCOPUS:85120776449
SN - 1474-7596
VL - 22
JO - Genome Biology
JF - Genome Biology
IS - 1
M1 - 332
ER -