A statistical method for studying correlated rare events and their risk factors

Research output: Contribution to journal > Article

2 Citations (Scopus)

Abstract

Longitudinal studies of rare events such as cervical high-grade lesions or colorectal polyps that can recur often involve correlated binary data. Risk factors for these events cannot be reliably examined using conventional statistical methods. For example, logistic regression models that incorporate generalized estimating equations often fail to converge or provide inaccurate results when analyzing data of this type. Although exact methods have been reported, they are complex and computationally difficult. The current paper proposes a mathematically straightforward and easy-to-use two-step approach involving (i) an additive model to measure associations between a rare or uncommon correlated binary event and potential risk factors and (ii) a permutation test to estimate the statistical significance of these associations. Simulation studies showed that the proposed method reliably tests and accurately estimates the associations of exposure with correlated binary rare events. This method was then applied to a longitudinal study of human leukocyte antigen (HLA) genotype and risk of cervical high-grade squamous intraepithelial lesions (HSIL) among HIV-infected and HIV-uninfected women. Results showed statistically significant associations of two HLA alleles among HIV-negative but not HIV-positive women, suggesting that immune status may modify the HLA and cervical HSIL association. Overall, the proposed method avoids model nonconvergence problems and provides a computationally simple, accurate, and powerful approach for the analysis of risk factor associations with rare/uncommon correlated binary events.
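
The two-step idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the form of the additive model or the permutation scheme, so everything here is an assumption made for illustration and not the authors' method: the additive measure is taken to be a simple risk difference in per-visit event proportions, exposure is treated as a binary subject-level factor, and exposure labels are permuted across subjects so that each subject's repeated outcomes stay together. The function names and the toy data are hypothetical.

```python
import random


def risk_difference(exposure, outcomes):
    """Per-visit event proportion in exposed minus unexposed subjects.

    exposure: list of 0/1 labels, one per subject (assumed subject-level).
    outcomes: list of lists of 0/1 visit outcomes, one inner list per subject.
    """
    events = [0, 0]
    visits = [0, 0]
    for label, subject_outcomes in zip(exposure, outcomes):
        events[label] += sum(subject_outcomes)
        visits[label] += len(subject_outcomes)
    if visits[0] == 0 or visits[1] == 0:
        raise ValueError("both exposure groups need at least one visit")
    return events[1] / visits[1] - events[0] / visits[0]


def permutation_test(exposure, outcomes, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the risk difference.

    Labels are shuffled across subjects, not visits, so the
    within-subject correlation of the outcomes is left untouched.
    """
    rng = random.Random(seed)
    observed = risk_difference(exposure, outcomes)
    labels = list(exposure)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        permuted = risk_difference(labels, outcomes)
        # Small tolerance so ties with the observed value count as extreme.
        if abs(permuted) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + extreme) / (1 + n_perm)
    return observed, p_value


# Hypothetical toy data: 3 exposed and 5 unexposed subjects, 3 visits each.
exposure = [1, 1, 1, 0, 0, 0, 0, 0]
outcomes = [
    [1, 0, 1], [0, 1, 0], [1, 0, 0],
    [0, 0, 0], [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0],
]
observed, p_value = permutation_test(exposure, outcomes)
print(f"risk difference = {observed:.3f}, permutation p = {p_value:.3f}")
```

Permuting at the subject level is the design choice that matters in this sketch: shuffling individual visits would break the correlation among a subject's repeated outcomes, which is the feature of the data the abstract is concerned with.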

Original language: English (US)
Pages (from-to): 1416-1428
Number of pages: 13
Journal: Statistical Methods in Medical Research
Volume: 26
Issue number: 3
DOIs: 10.1177/0962280215581112
State: Published - Jun 1 2017

Fingerprint

Rare Events
Risk Factors
Statistical method
Leukocytes
HLA Antigens
HIV
Longitudinal Study
Binary
Longitudinal Studies
Logistic Models
Correlated Binary Data
Association Measure
Permutation Test
Generalized Estimating Equations
Additive Models
Logistic Regression Model
Exact Method
Statistical Significance
Polyps
Inaccurate

Keywords

  • Correlated data
  • exact method
  • generalized estimating equation
  • permutation
  • rare events

ASJC Scopus subject areas

  • Epidemiology
  • Statistics and Probability
  • Health Information Management

Cite this

A statistical method for studying correlated rare events and their risk factors. / Xue, Xiaonan (Nan); Kim, Mimi; Wang, Tao; Kuniholm, Mark H.; Strickler, Howard.

In: Statistical Methods in Medical Research, Vol. 26, No. 3, 01.06.2017, p. 1416-1428.

@article{a583764190ea4544a0d74138d8244c7d,
title = "A statistical method for studying correlated rare events and their risk factors",
abstract = "Longitudinal studies of rare events such as cervical high-grade lesions or colorectal polyps that can recur often involve correlated binary data. Risk factors for these events cannot be reliably examined using conventional statistical methods. For example, logistic regression models that incorporate generalized estimating equations often fail to converge or provide inaccurate results when analyzing data of this type. Although exact methods have been reported, they are complex and computationally difficult. The current paper proposes a mathematically straightforward and easy-to-use two-step approach involving (i) an additive model to measure associations between a rare or uncommon correlated binary event and potential risk factors and (ii) a permutation test to estimate the statistical significance of these associations. Simulation studies showed that the proposed method reliably tests and accurately estimates the associations of exposure with correlated binary rare events. This method was then applied to a longitudinal study of human leukocyte antigen (HLA) genotype and risk of cervical high-grade squamous intraepithelial lesions (HSIL) among HIV-infected and HIV-uninfected women. Results showed statistically significant associations of two HLA alleles among HIV-negative but not HIV-positive women, suggesting that immune status may modify the HLA and cervical HSIL association. Overall, the proposed method avoids model nonconvergence problems and provides a computationally simple, accurate, and powerful approach for the analysis of risk factor associations with rare/uncommon correlated binary events.",
keywords = "Correlated data, exact method, generalized estimating equation, permutation, rare events",
author = "Xue, {Xiaonan (Nan)} and Mimi Kim and Tao Wang and Kuniholm, {Mark H.} and Howard Strickler",
year = "2017",
month = "6",
day = "1",
doi = "10.1177/0962280215581112",
language = "English (US)",
volume = "26",
pages = "1416--1428",
journal = "Statistical Methods in Medical Research",
issn = "0962-2802",
publisher = "SAGE Publications Ltd",
number = "3"
}

TY - JOUR

T1 - A statistical method for studying correlated rare events and their risk factors

AU - Xue, Xiaonan (Nan)

AU - Kim, Mimi

AU - Wang, Tao

AU - Kuniholm, Mark H.

AU - Strickler, Howard

PY - 2017/6/1

Y1 - 2017/6/1

AB - Longitudinal studies of rare events such as cervical high-grade lesions or colorectal polyps that can recur often involve correlated binary data. Risk factors for these events cannot be reliably examined using conventional statistical methods. For example, logistic regression models that incorporate generalized estimating equations often fail to converge or provide inaccurate results when analyzing data of this type. Although exact methods have been reported, they are complex and computationally difficult. The current paper proposes a mathematically straightforward and easy-to-use two-step approach involving (i) an additive model to measure associations between a rare or uncommon correlated binary event and potential risk factors and (ii) a permutation test to estimate the statistical significance of these associations. Simulation studies showed that the proposed method reliably tests and accurately estimates the associations of exposure with correlated binary rare events. This method was then applied to a longitudinal study of human leukocyte antigen (HLA) genotype and risk of cervical high-grade squamous intraepithelial lesions (HSIL) among HIV-infected and HIV-uninfected women. Results showed statistically significant associations of two HLA alleles among HIV-negative but not HIV-positive women, suggesting that immune status may modify the HLA and cervical HSIL association. Overall, the proposed method avoids model nonconvergence problems and provides a computationally simple, accurate, and powerful approach for the analysis of risk factor associations with rare/uncommon correlated binary events.

KW - Correlated data

KW - exact method

KW - generalized estimating equation

KW - permutation

KW - rare events

UR - http://www.scopus.com/inward/record.url?scp=85020753023&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85020753023&partnerID=8YFLogxK

U2 - 10.1177/0962280215581112

DO - 10.1177/0962280215581112

M3 - Article

C2 - 25854937

AN - SCOPUS:85020753023

VL - 26

SP - 1416

EP - 1428

JO - Statistical Methods in Medical Research

JF - Statistical Methods in Medical Research

SN - 0962-2802

IS - 3

ER -