Population structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis

Ani Manichaikul; Walter Palmas; Carlos J. Rodriguez; Carmen A. Peralta; Jasmin Divers; Xiuqing Guo; Wei Min Chen; Quenna Wong; Kayleen Williams; Kathleen F. Kerr; Kent D. Taylor; Michael Y. Tsai; Mark O. Goodarzi; Michèle M. Sale; Ana V. Diez-Roux; Stephen S. Rich; Jerome I. Rotter; Josyf C. Mychaleckyj

doi:10.1371/journal.pgen.1002640

Research output: Contribution to journal › Article › peer-review

68 Scopus citations

Abstract

Using ~60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population.
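The workflow summarized in the abstract (compute ancestry PCs in the study cohort, project reference samples onto them, then run k-means on the first four PCs) can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the genotypes are synthetic, the sample and SNP counts are arbitrary, and the function names `fit_pcs`, `project`, and `kmeans` are hypothetical helpers written for this example.

```python
import numpy as np


def fit_pcs(genotypes, n_pcs=4):
    """Return (per-SNP means, loadings) for the top n_pcs principal components."""
    means = genotypes.mean(axis=0)
    centered = genotypes - means
    # Rows of vt are the PC loading vectors, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return means, vt[:n_pcs].T


def project(genotypes, means, loadings):
    """Project samples onto previously fitted PCs, using the same centering."""
    return (genotypes - means) @ loadings


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic data (assumption): four groups with different allele frequencies,
# genotypes coded as 0/1/2 copies of an allele.
rng = np.random.default_rng(42)
n_snps = 500
group_freqs = rng.uniform(0.05, 0.95, size=(4, n_snps))
study = np.vstack(
    [rng.binomial(2, f, size=(50, n_snps)) for f in group_freqs]
).astype(float)
reference = np.vstack(
    [rng.binomial(2, f, size=(5, n_snps)) for f in group_freqs]
).astype(float)

# PCs are fitted on the study cohort; reference samples are only projected.
means, loadings = fit_pcs(study, n_pcs=4)
study_pcs = project(study, means, loadings)
reference_pcs = project(reference, means, loadings)

# Cluster the study samples on the first four PCs with k = 4.
labels, centers = kmeans(study_pcs, k=4)
print(np.bincount(labels, minlength=4))
```

In a real analysis the cluster labels would then be cross-tabulated against self-identified origin, and the projected reference samples would be used to interpret what each PC axis represents.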

Original language	English (US)
Article number	e1002640
Journal	PLoS Genetics
Volume	8
Issue number	4
DOIs	https://doi.org/10.1371/journal.pgen.1002640
State	Published - Apr 2012
Externally published	Yes

ASJC Scopus subject areas

Ecology, Evolution, Behavior and Systematics
Molecular Biology
Genetics
Genetics (clinical)
Cancer Research

Cite this

Manichaikul, A., Palmas, W., Rodriguez, C. J., Peralta, C. A., Divers, J., Guo, X., Chen, W. M., Wong, Q., Williams, K., Kerr, K. F., Taylor, K. D., Tsai, M. Y., Goodarzi, M. O., Sale, M. M., Diez-Roux, A. V., Rich, S. S., Rotter, J. I., & Mychaleckyj, J. C. (2012). Population structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis. PLoS Genetics, 8(4), Article e1002640. https://doi.org/10.1371/journal.pgen.1002640

Manichaikul, A, Palmas, W, Rodriguez, CJ, Peralta, CA, Divers, J, Guo, X, Chen, WM, Wong, Q, Williams, K, Kerr, KF, Taylor, KD, Tsai, MY, Goodarzi, MO, Sale, MM, Diez-Roux, AV, Rich, SS, Rotter, JI & Mychaleckyj, JC 2012, 'Population structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis', PLoS Genetics, vol. 8, no. 4, e1002640. https://doi.org/10.1371/journal.pgen.1002640

@article{a299dab7b3c0493daabf5237dbec3b36,

title = "Population structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis",

abstract = "Using ~60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population.",

author = "Ani Manichaikul and Walter Palmas and Rodriguez, {Carlos J.} and Peralta, {Carmen A.} and Jasmin Divers and Xiuqing Guo and Chen, {Wei Min} and Quenna Wong and Kayleen Williams and Kerr, {Kathleen F.} and Taylor, {Kent D.} and Tsai, {Michael Y.} and Goodarzi, {Mark O.} and Sale, {Mich{\`e}le M.} and Diez-Roux, {Ana V.} and Rich, {Stephen S.} and Rotter, {Jerome I.} and Mychaleckyj, {Josyf C.}",

year = "2012",

month = apr,

doi = "10.1371/journal.pgen.1002640",

language = "English (US)",

volume = "8",

journal = "PLoS Genetics",

issn = "1553-7390",

publisher = "Public Library of Science",

number = "4",

}

TY - JOUR

T1 - Population structure of Hispanics in the United States

T2 - The Multi-Ethnic Study of Atherosclerosis

AU - Manichaikul, Ani

AU - Palmas, Walter

AU - Rodriguez, Carlos J.

AU - Peralta, Carmen A.

AU - Divers, Jasmin

AU - Guo, Xiuqing

AU - Chen, Wei Min

AU - Wong, Quenna

AU - Williams, Kayleen

AU - Kerr, Kathleen F.

AU - Taylor, Kent D.

AU - Tsai, Michael Y.

AU - Goodarzi, Mark O.

AU - Sale, Michèle M.

AU - Diez-Roux, Ana V.

AU - Rich, Stephen S.

AU - Rotter, Jerome I.

AU - Mychaleckyj, Josyf C.

PY - 2012/4

Y1 - 2012/4

AB - Using ~60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population.

UR - http://www.scopus.com/inward/record.url?scp=84860571639&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84860571639&partnerID=8YFLogxK

U2 - 10.1371/journal.pgen.1002640

DO - 10.1371/journal.pgen.1002640

M3 - Article

C2 - 22511882

AN - SCOPUS:84860571639

SN - 1553-7390

VL - 8

JO - PLoS Genetics

JF - PLoS Genetics

IS - 4

M1 - e1002640

ER -
