Abstract
Background: Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait associated variants more common in these populations have likely been missed. Methods: To derive gene expression prediction models, we used ancestry-stratified datasets from the Multi-Ethnic Study of Atherosclerosis (MESA, including n = 229 African American and n = 381 Hispanic/Latino participants, monocytes) and the Depression Genes and Networks study (DGN, n = 922 European ancestry participants, whole blood). We then performed a transcriptome-wide association study (TWAS) for platelet count, hemoglobin, hematocrit, and white blood cell count in African (n = 27,955) and Hispanic/Latino (n = 28,324) ancestry participants. Results: Our results revealed 24 suggestive signals (p < 1 × 10−4 ) that were conditionally distinct from known GWAS identified variants and successfully replicated these signals in European ancestry subjects from UK Biobank. We found modestly improved correlation of predicted and measured gene expression in an independent African American cohort (the Genetic Epidemiology Network of Arteriopathy (GENOA) study (n = 802), lymphoblastoid cell lines) using the larger DGN reference panel; however, some genes were well predicted using MESA but not DGN. Conclusions: These analyses demonstrate the importance of performing TWAS and other genetic analyses across diverse populations and of balancing sample size and ancestry background matching when selecting a TWAS reference panel.
Original language | English (US) |
---|---|
Article number | 1049 |
Journal | Genes |
Volume | 12 |
Issue number | 7 |
DOIs | |
State | Published - Jul 2021 |
Keywords
- Ancestry
- Expression analysis
- Non-European populations
- TWAS (transcriptome-wide association study)
ASJC Scopus subject areas
- Genetics
- Genetics(clinical)
Fingerprint
Dive into the research topics of 'Transcriptome-wide association study of blood cell traits in african ancestry and hispanic/latino populations'. Together they form a unique fingerprint.Cite this
- APA
- Standard
- Harvard
- Vancouver
- Author
- BIBTEX
- RIS
In: Genes, Vol. 12, No. 7, 1049, 07.2021.
Research output: Contribution to journal › Article › peer-review
}
TY - JOUR
T1 - Transcriptome-wide association study of blood cell traits in african ancestry and hispanic/latino populations
AU - Wen, Jia
AU - Xie, Munan
AU - Rowland, Bryce
AU - Rosen, Jonathan D.
AU - Sun, Quan
AU - Chen, Jiawen
AU - Tapia, Amanda L.
AU - Qian, Huijun
AU - Kowalski, Madeline H.
AU - Shan, Yue
AU - Young, Kristin L.
AU - Graff, Marielisa
AU - Argos, Maria
AU - Avery, Christy L.
AU - Bien, Stephanie A.
AU - Buyske, Steve
AU - Yin, Jie
AU - Choquet, Hélène
AU - Fornage, Myriam
AU - Hodonsky, Chani J.
AU - Jorgenson, Eric
AU - Kooperberg, Charles
AU - Loos, Ruth J.F.
AU - Liu, Yongmei
AU - Moon, Jee Young
AU - North, Kari E.
AU - Rich, Stephen S.
AU - Rotter, Jerome I.
AU - Smith, Jennifer A.
AU - Zhao, Wei
AU - Shang, Lulu
AU - Wang, Tao
AU - Zhou, Xiang
AU - Reiner, Alexander P.
AU - Raffield, Laura M.
AU - Li, Yun
N1 - Funding Information: The MESA Epigenomics & Transcriptomics Studies were funded by NIH grants 1R01HL101250, 1RF1AG054474, R01HL126477, R01DK101921, and R01HL135009. Genotyping of the GERA cohort was funded by a grant from the National Institute on Aging, National Institute of Mental Health, and National Institute of Health Common Fund (RC2 AG036607). The Population Architecture Using Genomics and Epidemiology (PAGE) program is funded by the National Human Genome Research Institute (NHGRI) with co-funding from the National Institute on Minority Health and Health Disparities (NIMHD). The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of the NIH. The PAGE consortium thanks the staff and participants of all PAGE studies for their important contributions. We thank Rasheeda Williams and Margaret Ginoza for providing assistance with program coordination. The complete list of PAGE members can be found at http://www.pagestudy.org (accessed January 2020). Assistance with data management, data integration, data dissemination, genotype imputation, ancestry deconvolution, population genetics, analysis pipelines, and general study coordination was provided by the PAGE Coordinating Center (NIH U01HG007419). Genotyping services were provided by the Center for Inherited Disease Research (CIDR). CIDR is fully funded through a federal contract from the National Institutes of Health to The Johns Hopkins University, contract number HHSN268201200008I. Genotype data quality control and quality assurance services were provided by the Genetic Analysis Center in the Biostatistics Department of the University of Washington, through support provided by the CIDR contract. The data and materials included in this report result from collaboration between the following studies and organizations: HCHS/SOL: Primary funding support to Kari E. North and colleagues is provided by U01HG007416. Additional support was provided via R01DK101855 and 15GRNT25880008. The Hispanic Community Health Study/Study of Latinos was carried out as a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute (NHLBI) to the University of North Carolina (N01-HC65233), University of Miami (N01-HC65234), Albert Einstein College of Medicine (N01-HC65235), Northwestern University (N01-HC65236), and San Diego State University (N01-HC65237). The following Institutes/Centers/Offices contribute to the HCHS/SOL through a transfer of funds to the NHLBI: National Institute on Minority Health and Health Disparities, National Institute on Deafness and Other Communication Disorders, National Institute of Dental and Craniofacial Research, National Institute of Diabetes and Digestive and Kidney Diseases, National Institute of Neurological Disorders and Stroke, NIH Institution-Office of Dietary Supplements. The Genetic Analysis Center at the University of Washington was supported by NHLBI and NIDCR contracts (HHSN268201300005C AM03 and MOD03). WHI: Funding support for the “Exonic variants and their relation to complex traits in minorities of the WHI” study is provided through the NHGRI PAGE program (NIH U01HG007376). The WHI program is funded by the National Heart, Lung, and Blood Institute, National Institutes of Health, U.S. Department of Health and Human Services through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C. The authors thank the WHI investigators and staff for their dedication, and the study participants for making the program possible. A listing of WHI investigators can be found at: https://www.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI% 20Investigator%20Long%20List.pdf (accessed January 2020). ARIC: The Atherosclerosis Risk in Communities study has been funded in whole or in part with Federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under Contract nos. (HHSN268201700001I, HHSN268201700002I, HHSN268201700003I, HHSN268201700005I, HHSN268201700004I). The authors thank the staff and participants of the ARIC study for their important contributions. CARDIA: The Coronary Artery Risk Development in Young Adults Study (CARDIA) is conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with the University of Alabama at Birmingham (HHSN268201800005I & HHSN268201800007I), Northwestern University (HHSN268201800003I), University of Minnesota (HHSN268201800006I), and Kaiser Foundation Research Institute (HHSN268201800004I). Support for the Genetic Epidemiology Network of Arteriopathy (GENOA) data collection and analysis was provided by the National Heart, Lung and Blood Institute (U01HL054457, RC1HL100185, R01HL119443, R01HL133221, and R01HL141292) and the National Institute of Neurological Disorders and Stroke (R01NS041558) of the National Institutes of Health. Finally, we would like to acknowledge use of the Trans-Omics in Precision Medicine (TOPMed) program imputation panel (freeze 5 version) supported by the National Heart, Lung and Blood Institute (NHLBI); see www.nhlbiwgs.org. TOPMed study investigators contributed data to the reference panel, which was accessed through the Michigan Imputation Server; see https://imputationserver.sph.umich.edu (accessed January 2020). The newest version of the imputation panel (which was not used in this study) can be found at https://imputation.biodatacatalyst.nhlbi.nih.gov. The panel was constructed and implemented by the TOPMed Informatics Research Center at the University of Michigan (3R01HL-117626-02S1; contract HHSN268201800002I). The TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract HHSN268201800001I) provided additional data management, sample identity checks, and overall program coordination and support. We gratefully acknowledge the studies and participants who provided biological samples and data for TOPMed. Funding Information: Acknowledgments: This research has been conducted using the UK Biobank Resource under Application Number 25953. Support for title page creation and format was provided by AuthorArranger, a tool developed at the National Cancer Institute. Whole genome sequencing (WGS) for the Trans-Omics in Precision Medicine (TOPMed) program was supported by the National Heart, Lung and Blood Institute (NHLBI). WGS for “NHLBI TOPMed: Multi-Ethnic Study of Atherosclerosis (MESA)” (phs001416.v1.p1) was performed at the Broad Institute of MIT and Harvard (3U54HG003067-13S1). Centralized read mapping and genotype calling, along with variant quality metrics and filtering were provided by the TOPMed Informatics Research Center (3R01HL-117626-02S1, contract HHSN268201800002I). Phenotype harmonization, data management, sample-identity QC, and general study coordination, were provided by the TOPMed Data Coordinating Center (R01HL-120393; U01HL-120393; contract HHSN268201800001I). MESA and the MESA SHARe project are conducted and supported by the National Heart, Lung, and Blood Institute (NHLBI) in collaboration with MESA investigators. Support for MESA is provided by contracts 75N92020D00001, HHSN268201500003I, N01-HC-95159, 75N92020D00005, N01-HC-95160, 75N92020D00002, N01-HC-95161, 75N92020D00003, N01-HC-95162, 75N92020D00006, N01-HC-95163, 75N92020D00004, N01-HC-95164, 75N92020D00007, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95168, N01-HC-95169, UL1-TR-000040, UL1-TR-001079, UL1-TR-001420, UL1-TR-001881, and DK063491. Funding Information: Funding: This project was funded by R01HL129132. The project described was also supported by the National Center for Advancing Translational Sciences, National Institutes of Health, through Grant KL2TR002490 (L.M.R.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. CJH was additionally funded by T32 HL007284. LMR was additionally funded by R01HG010297 and T32 HL129982. YL was additionally funded by R01 GM105785 and U01 DA052713. This material is additionally based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1650116 to BR. Publisher Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/7
Y1 - 2021/7
N2 - Background: Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait associated variants more common in these populations have likely been missed. Methods: To derive gene expression prediction models, we used ancestry-stratified datasets from the Multi-Ethnic Study of Atherosclerosis (MESA, including n = 229 African American and n = 381 Hispanic/Latino participants, monocytes) and the Depression Genes and Networks study (DGN, n = 922 European ancestry participants, whole blood). We then performed a transcriptome-wide association study (TWAS) for platelet count, hemoglobin, hematocrit, and white blood cell count in African (n = 27,955) and Hispanic/Latino (n = 28,324) ancestry participants. Results: Our results revealed 24 suggestive signals (p < 1 × 10−4 ) that were conditionally distinct from known GWAS identified variants and successfully replicated these signals in European ancestry subjects from UK Biobank. We found modestly improved correlation of predicted and measured gene expression in an independent African American cohort (the Genetic Epidemiology Network of Arteriopathy (GENOA) study (n = 802), lymphoblastoid cell lines) using the larger DGN reference panel; however, some genes were well predicted using MESA but not DGN. Conclusions: These analyses demonstrate the importance of performing TWAS and other genetic analyses across diverse populations and of balancing sample size and ancestry background matching when selecting a TWAS reference panel.
AB - Background: Thousands of genetic variants have been associated with hematological traits, though target genes remain unknown at most loci. Moreover, limited analyses have been conducted in African ancestry and Hispanic/Latino populations; hematological trait associated variants more common in these populations have likely been missed. Methods: To derive gene expression prediction models, we used ancestry-stratified datasets from the Multi-Ethnic Study of Atherosclerosis (MESA, including n = 229 African American and n = 381 Hispanic/Latino participants, monocytes) and the Depression Genes and Networks study (DGN, n = 922 European ancestry participants, whole blood). We then performed a transcriptome-wide association study (TWAS) for platelet count, hemoglobin, hematocrit, and white blood cell count in African (n = 27,955) and Hispanic/Latino (n = 28,324) ancestry participants. Results: Our results revealed 24 suggestive signals (p < 1 × 10−4 ) that were conditionally distinct from known GWAS identified variants and successfully replicated these signals in European ancestry subjects from UK Biobank. We found modestly improved correlation of predicted and measured gene expression in an independent African American cohort (the Genetic Epidemiology Network of Arteriopathy (GENOA) study (n = 802), lymphoblastoid cell lines) using the larger DGN reference panel; however, some genes were well predicted using MESA but not DGN. Conclusions: These analyses demonstrate the importance of performing TWAS and other genetic analyses across diverse populations and of balancing sample size and ancestry background matching when selecting a TWAS reference panel.
KW - Ancestry
KW - Expression analysis
KW - Non-European populations
KW - TWAS (transcriptome-wide association study)
UR - http://www.scopus.com/inward/record.url?scp=85110803376&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85110803376&partnerID=8YFLogxK
U2 - 10.3390/genes12071049
DO - 10.3390/genes12071049
M3 - Article
C2 - 34356065
AN - SCOPUS:85110803376
SN - 2073-4425
VL - 12
JO - Genes
JF - Genes
IS - 7
M1 - 1049
ER -