A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped

Tao Wang, Gloria Ho, Qian K. Ye, Howard Strickler, Robert C. Elston

Research output: Contribution to journal › Article

30 Citations (Scopus)

Abstract

Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.
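The degrees-of-freedom argument in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes additive 0/1/2 genotype coding, uses only the textbook first partial least-squares component (weights proportional to the covariance between each centered SNP and the centered trait), and ignores the cross-locus gametic disequilibrium that the paper's compression also uses. All function names and the toy data are hypothetical.

```python
import math


def saturated_term_count(p, q):
    """Regression terms (excluding the intercept) when every SNP pair interacts.

    Assumes additive coding: p + q main effects plus p * q pairwise products.
    """
    return p + q + p * q


def first_pls_component(genotypes, trait):
    """Return (weights, scores) of the first PLS component for one gene.

    genotypes: list of rows, one per subject, each a list of 0/1/2 codes.
    trait: list of numeric trait values, one per subject.
    """
    n, m = len(genotypes), len(genotypes[0])
    trait_mean = sum(trait) / n
    yc = [y - trait_mean for y in trait]
    col_means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    xc = [[row[j] - col_means[j] for j in range(m)] for row in genotypes]
    # Weight of SNP j is proportional to its covariance with the trait.
    raw = [sum(xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = math.sqrt(sum(v * v for v in raw))
    if norm == 0:
        raise ValueError("no SNP covaries with the trait")
    weights = [v / norm for v in raw]
    scores = [sum(xc[i][j] * weights[j] for j in range(m)) for i in range(n)]
    return weights, scores


def reduced_design(scores_a, scores_b):
    """Rows of [t_A, t_B, t_A * t_B]: two main effects and one interaction."""
    return [[a, b, a * b] for a, b in zip(scores_a, scores_b)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    gene_a = [[0, 1], [1, 1], [2, 0], [1, 2]]
    gene_b = [[1, 0, 2], [0, 1, 1], [2, 2, 0], [1, 1, 1]]
    trait = [0.0, 1.0, 2.0, 1.0]
    _, t_a = first_pls_component(gene_a, trait)
    _, t_b = first_pls_component(gene_b, trait)
    print("saturated terms:", saturated_term_count(2, 3))
    print("reduced terms:", len(reduced_design(t_a, t_b)[0]))
```

With 10 SNPs in one gene and 8 in the other, the saturated model under this coding would carry 98 terms, whereas the compressed design above always has three columns. Whether one component per gene suffices is an assumption of this sketch, not a claim from the paper.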

Original language: English (US)
Pages (from-to): 6-15
Number of pages: 10
Journal: Genetic Epidemiology
Volume: 33
Issue number: 1
DOIs: 10.1002/gepi.20351
State: Published - 2009

Fingerprint

Gene-Environment Interaction
Least-Squares Analysis
Genes
Genotype
Endometrial Neoplasms
Chromosome Mapping
Genetic Association Studies
Women's Health
Principal Component Analysis
Body Mass Index

Keywords

  • Endometrial cancer
  • Environment
  • Gene
  • Genetic association analysis
  • Interaction

ASJC Scopus subject areas

  • Genetics (clinical)
  • Epidemiology

Cite this

@article{72809b4fedd746699e7bc11b4ec3cfc5,
title = "A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped",
abstract = "Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.",
keywords = "Endometrial cancer, Environment, Gene, Genetic association analysis, Interaction",
author = "Wang, Tao and Ho, Gloria and Ye, {Qian K.} and Strickler, Howard and Elston, {Robert C.}",
year = "2009",
doi = "10.1002/gepi.20351",
language = "English (US)",
volume = "33",
pages = "6--15",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "1",

}

TY - JOUR

T1 - A partial least-square approach for modeling gene-gene and gene-environment interactions when multiple markers are genotyped

AU - Wang, Tao

AU - Ho, Gloria

AU - Ye, Qian K.

AU - Strickler, Howard

AU - Elston, Robert C.

PY - 2009

Y1 - 2009

N2 - Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.

AB - Genetic association studies achieve an unprecedented level of resolution in mapping disease genes by genotyping dense single nucleotide polymorphisms (SNPs) in a gene region. Meanwhile, these studies require new powerful statistical tools that can optimally handle a large amount of information provided by genotype data. A question that arises is how to model interactions between two genes. Simply modeling all possible interactions between the SNPs in two gene regions is not desirable because a greatly increased number of degrees of freedom can be involved in the test statistic. We introduce an approach to reduce the genotype dimension in modeling interactions. The genotype compression of this approach is built upon the information on both the trait and the cross-locus gametic disequilibrium between SNPs in two interacting genes, in such a way as to parsimoniously model the interactions without loss of useful information in the process of dimension reduction. As a result, it improves power to detect association in the presence of gene-gene interactions. This approach can be similarly applied for modeling gene-environment interactions. We compare this method with other approaches: the corresponding test without modeling any interaction, that based on a saturated interaction model, that based on principal component analysis, and that based on Tukey's one-degree-of-freedom model. Our simulations suggest that this new approach has superior power to that of the other methods. In an application to endometrial cancer case-control data from the Women's Health Initiative, this approach detected AKT1 and AKT2 as being significantly associated with endometrial cancer susceptibility by taking into account their interactions with body mass index.

KW - Endometrial cancer

KW - Environment

KW - Gene

KW - Genetic association analysis

KW - Interaction

UR - http://www.scopus.com/inward/record.url?scp=65849377200&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=65849377200&partnerID=8YFLogxK

U2 - 10.1002/gepi.20351

DO - 10.1002/gepi.20351

M3 - Article

C2 - 18615621

AN - SCOPUS:65849377200

VL - 33

SP - 6

EP - 15

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 1

ER -