Marginal and mixed-effects models in the analysis of human papillomavirus natural history data

Xiaonan Xue, Stephen J. Gange, Ye Zhong, Robert D. Burk, Howard Minkoff, L. Stewart Massad, D. Heather Watts, Mark H. Kuniholm, Kathryn Anastos, Alexandra M. Levine, Melissa Fazzari, Gypsyamber D'Souza, Michael Plankey, Joel M. Palefsky, Howard D. Strickler

Research output: Contribution to journal › Article

64 Citations (Scopus)

Abstract

Human papillomavirus (HPV) natural history has several characteristics that, at least from a statistical perspective, are not often encountered elsewhere in infectious disease and cancer research. There are, for example, multiple HPV types, and infection by each HPV type may be considered a separate event. Although concurrent infections are common, the prevalence, incidence, and duration/persistence of each individual HPV type can be separately measured. However, repeated measures involving the same subject tend to be correlated. The probability of detecting any given HPV type, for example, is greater among individuals who are currently positive for at least one other HPV type. Serial testing for HPV over time represents a second form of repeated measures. Statistical inferences that fail to take these correlations into account would be invalid. However, methods that do not use all the data would be inefficient. Marginal and mixed-effects models can address these issues but are not frequently used in HPV research. The current study provides an overview of these methods and then uses HPV data from a cohort of HIV-positive women to illustrate how they may be applied and to compare their results. The findings show the greater efficiency of these models compared with standard logistic regression and Cox models. Because mixed-effects models estimate subject-specific associations, they sometimes gave much higher effect estimates than marginal models, which estimate population-averaged associations. Overall, the results show that marginal and mixed-effects models are efficient for studying HPV natural history, but also highlight the importance of understanding how these models differ.
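The gap between subject-specific and population-averaged estimates described in the abstract can be illustrated numerically. The sketch below is not the authors' analysis: it assumes the simplest mixed-effects model, a logistic regression with one normally distributed random intercept per subject, and the values of `beta0`, `beta1`, and `sigma` are hypothetical and not taken from the study. It averages the subject-level probabilities over the random intercept and reports the log odds ratio that a marginal model would target.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    return math.log(p / (1.0 - p))

def marginal_prob(eta, sigma, n=2001, width=8.0):
    """Average expit(eta + b) over b ~ Normal(0, sigma^2) with the trapezoidal rule."""
    if sigma == 0:
        return expit(eta)
    lo = -width * sigma
    h = 2.0 * width * sigma / (n - 1)
    total = 0.0
    for i in range(n):
        b = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0  # trapezoid end points
        density = math.exp(-0.5 * (b / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += weight * expit(eta + b) * density
    return total * h

def population_averaged_log_or(beta0, beta1, sigma):
    """Log odds ratio comparing exposed vs. unexposed after averaging over subjects."""
    p_unexposed = marginal_prob(beta0, sigma)
    p_exposed = marginal_prob(beta0 + beta1, sigma)
    return logit(p_exposed) - logit(p_unexposed)

# Hypothetical subject-specific intercept and log odds ratio (assumptions, not study values).
beta0, beta1 = -2.0, 1.0
for sigma in (0.0, 1.0, 2.0, 3.0):
    pa = population_averaged_log_or(beta0, beta1, sigma)
    print(f"sigma={sigma:.1f}  subject-specific={beta1:.3f}  population-averaged={pa:.3f}")
```

Under these assumptions the two log odds ratios coincide when there is no between-subject variation (`sigma = 0`), and the population-averaged value shrinks toward zero as `sigma` grows. This is one way a mixed-effects model can report a larger effect estimate than a marginal model fitted to the same data.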

Original language: English (US)
Pages (from-to): 159-169
Number of pages: 11
Journal: Cancer Epidemiology Biomarkers and Prevention
Volume: 19
Issue number: 1
DOIs: https://doi.org/10.1158/1055-9965.EPI-09-0546
State: Published - Jan 2010

Fingerprint

Natural History
Logistic Models
Papillomavirus Infections
Proportional Hazards Models
Research
Communicable Diseases
HIV
Incidence
Infection
Population

ASJC Scopus subject areas

  • Epidemiology
  • Oncology

Cite this

Marginal and mixed-effects models in the analysis of human papillomavirus natural history data. / Xue, Xiaonan; Gange, Stephen J.; Zhong, Ye; Burk, Robert D.; Minkoff, Howard; Massad, L. Stewart; Watts, D. Heather; Kuniholm, Mark H.; Anastos, Kathryn; Levine, Alexandra M.; Fazzari, Melissa; D'Souza, Gypsyamber; Plankey, Michael; Palefsky, Joel M.; Strickler, Howard D.

In: Cancer Epidemiology Biomarkers and Prevention, Vol. 19, No. 1, 01.2010, p. 159-169.

Xue, X, Gange, SJ, Zhong, Y, Burk, RD, Minkoff, H, Massad, LS, Watts, DH, Kuniholm, MH, Anastos, K, Levine, AM, Fazzari, M, D'Souza, G, Plankey, M, Palefsky, JM & Strickler, HD 2010, 'Marginal and mixed-effects models in the analysis of human papillomavirus natural history data', Cancer Epidemiology Biomarkers and Prevention, vol. 19, no. 1, pp. 159-169. https://doi.org/10.1158/1055-9965.EPI-09-0546
@article{d72e13bfdb294d7bb9ff30cc3c7a007f,
title = "Marginal and mixed-effects models in the analysis of human papillomavirus natural history data",
author = "Xiaonan Xue and Gange, {Stephen J.} and Ye Zhong and Burk, {Robert D.} and Howard Minkoff and Massad, {L. Stewart} and Watts, {D. Heather} and Kuniholm, {Mark H.} and Kathryn Anastos and Levine, {Alexandra M.} and Melissa Fazzari and Gypsyamber D'Souza and Michael Plankey and Palefsky, {Joel M.} and Strickler, {Howard D.}",
year = "2010",
month = "1",
doi = "10.1158/1055-9965.EPI-09-0546",
language = "English (US)",
volume = "19",
pages = "159--169",
journal = "Cancer Epidemiology Biomarkers and Prevention",
issn = "1055-9965",
publisher = "American Association for Cancer Research Inc.",
number = "1",

}

TY - JOUR
T1 - Marginal and mixed-effects models in the analysis of human papillomavirus natural history data
AU - Xue, Xiaonan
AU - Gange, Stephen J.
AU - Zhong, Ye
AU - Burk, Robert D.
AU - Minkoff, Howard
AU - Massad, L. Stewart
AU - Watts, D. Heather
AU - Kuniholm, Mark H.
AU - Anastos, Kathryn
AU - Levine, Alexandra M.
AU - Fazzari, Melissa
AU - D'Souza, Gypsyamber
AU - Plankey, Michael
AU - Palefsky, Joel M.
AU - Strickler, Howard D.
PY - 2010/1
Y1 - 2010/1
UR - http://www.scopus.com/inward/record.url?scp=74549157136&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=74549157136&partnerID=8YFLogxK
U2 - 10.1158/1055-9965.EPI-09-0546
DO - 10.1158/1055-9965.EPI-09-0546
M3 - Article
C2 - 20056635
AN - SCOPUS:74549157136
VL - 19
SP - 159
EP - 169
JO - Cancer Epidemiology Biomarkers and Prevention
JF - Cancer Epidemiology Biomarkers and Prevention
SN - 1055-9965
IS - 1
ER -