Comparison of statistical methods for analysis of clustered binary observations

Moonseong Heo, Andrew C. Leon

Research output: Contribution to journal › Article

36 Citations (Scopus)

Abstract

When correlated observations are obtained in a randomized controlled trial, the assumption of independence among observations within a cluster is unlikely to hold because the observations share the same cluster (e.g., clinic, physician, or subject). Further, the outcome measurements of interest are often binary. The objective of this paper is to compare the performance of four statistical methods for analysis of clustered binary observations: (1) the full likelihood method; (2) the penalized quasi-likelihood method; (3) the generalized estimating equation method; and (4) the fixed-effects logistic regression method. The first three methods take correlations into account in inferential processes, whereas the last does not. Type I error rate, power, bias, and standard error are compared across the four methods through computer simulations under varying effect sizes, intraclass correlation coefficients, numbers of clusters, and numbers of observations per cluster, including large cluster sizes of 20 and 100 observations. The results show that the performance of the full likelihood and the penalized quasi-likelihood methods is superior for analysis of clustered binary observations, and is not necessarily inferior to that of the fixed-effects logistic regression fit even when within-cluster correlations are zero.
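The kind of simulation the abstract describes can be illustrated with a small sketch. This is not the authors' code or their data-generating model. It assumes a beta-binomial mechanism (each cluster draws its own success probability from a Beta distribution chosen to give the target intraclass correlation) and uses a pooled two-sample z-test for proportions as a simple stand-in for an analysis that ignores clustering. All function names and parameter values are hypothetical.

```python
import math
import random


def simulate_arm(n_clusters, cluster_size, p, icc, rng):
    """Return (successes, total) for one trial arm of clustered binary data.

    Assumption: each cluster draws its success probability from Beta(a, b)
    with mean p and a + b = 1/icc - 1, which yields an intraclass
    correlation of icc (valid for 0 < icc < 1). With icc == 0 every
    cluster uses p, so observations are independent.
    """
    successes = 0
    for _ in range(n_clusters):
        if icc > 0:
            scale = 1.0 / icc - 1.0
            p_cluster = rng.betavariate(p * scale, (1.0 - p) * scale)
        else:
            p_cluster = p
        successes += sum(rng.random() < p_cluster for _ in range(cluster_size))
    return successes, n_clusters * cluster_size


def naive_z_test_rejects(x1, n1, x2, n2, z_crit=1.959964):
    """Two-sided pooled z-test for two proportions, ignoring clustering."""
    pooled = (x1 + x2) / (n1 + n2)
    var = pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2)
    if var == 0.0:
        return False
    z = (x1 / n1 - x2 / n2) / math.sqrt(var)
    return abs(z) > z_crit


def type_i_error(icc, n_clusters=10, cluster_size=20, p=0.5,
                 n_sims=2000, seed=1):
    """Empirical rejection rate under the null (same p in both arms)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x1, n1 = simulate_arm(n_clusters, cluster_size, p, icc, rng)
        x2, n2 = simulate_arm(n_clusters, cluster_size, p, icc, rng)
        rejections += naive_z_test_rejects(x1, n1, x2, n2)
    return rejections / n_sims


if __name__ == "__main__":
    for icc in (0.0, 0.1, 0.3):
        print(f"ICC={icc:.1f}  naive type I error ~ {type_i_error(icc):.3f}")
```

Under these assumptions, the rejection rate should sit near the nominal 0.05 when the intraclass correlation is zero and rise as the correlation grows, because the naive test understates the variance of the arm-level proportions. Comparing the likelihood-based, penalized quasi-likelihood, and generalized estimating equation methods themselves would require a mixed-model or GEE library and is outside this sketch.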

Original language: English (US)
Pages (from-to): 911-923
Number of pages: 13
Journal: Statistics in Medicine
Volume: 24
Issue number: 6
DOIs: 10.1002/sim.1958
State: Published - Mar 30 2005
Externally published: Yes

Keywords

  • Bias
  • Binary outcomes
  • Clustered randomized controlled trials
  • Intraclass correlation coefficient
  • Power
  • Type I error rate

ASJC Scopus subject areas

  • Epidemiology

Cite this

Comparison of statistical methods for analysis of clustered binary observations. / Heo, Moonseong; Leon, Andrew C.

In: Statistics in Medicine, Vol. 24, No. 6, 30.03.2005, p. 911-923.

@article{cb7174a4a820457e8250abd3838ebd10,
title = "Comparison of statistical methods for analysis of clustered binary observations",
abstract = "When correlated observations are obtained in a randomized controlled trial, the assumption of independence among observations within cluster likely will not hold because the observations share the same cluster (e.g. clinic, physician, or subject). Further, the outcome measurements of interest are often binary. The objective of this paper is to compare the performance of four statistical methods for analysis of clustered binary observations: namely (1) full likelihood method; (2) penalized quasi-likelihood method; (3) generalized estimating equation method; (4) fixed-effects logistic regression method. The first three methods take correlations into account in inferential processes whereas the last method does not. Type I error rate, power, bias, and standard error are compared across the four statistical methods through computer simulations under varying effect sizes, intraclass correlation coefficients, number of clusters, and number of observations per cluster, including large numbers 20 and 100 of observations per cluster. The results show that the performance of the full likelihood and the penalized quasi-likelihood methods is superior for analysis of clustered binary observations, and is not necessarily inferior to that of the fixed-effects logistic regression fit even when within-cluster correlations are zero.",
keywords = "Bias, Binary outcomes, Clustered randomized controlled trials, Intraclass correlation coefficient, Power, Type I error rate",
author = "Heo, Moonseong and Leon, {Andrew C.}",
year = "2005",
month = "3",
day = "30",
doi = "10.1002/sim.1958",
language = "English (US)",
volume = "24",
pages = "911--923",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "6",

}

TY  - JOUR
T1  - Comparison of statistical methods for analysis of clustered binary observations
AU  - Heo, Moonseong
AU  - Leon, Andrew C.
PY  - 2005/3/30
Y1  - 2005/3/30
AB  - When correlated observations are obtained in a randomized controlled trial, the assumption of independence among observations within cluster likely will not hold because the observations share the same cluster (e.g. clinic, physician, or subject). Further, the outcome measurements of interest are often binary. The objective of this paper is to compare the performance of four statistical methods for analysis of clustered binary observations: namely (1) full likelihood method; (2) penalized quasi-likelihood method; (3) generalized estimating equation method; (4) fixed-effects logistic regression method. The first three methods take correlations into account in inferential processes whereas the last method does not. Type I error rate, power, bias, and standard error are compared across the four statistical methods through computer simulations under varying effect sizes, intraclass correlation coefficients, number of clusters, and number of observations per cluster, including large numbers 20 and 100 of observations per cluster. The results show that the performance of the full likelihood and the penalized quasi-likelihood methods is superior for analysis of clustered binary observations, and is not necessarily inferior to that of the fixed-effects logistic regression fit even when within-cluster correlations are zero.
KW  - Bias
KW  - Binary outcomes
KW  - Clustered randomized controlled trials
KW  - Intraclass correlation coefficient
KW  - Power
KW  - Type I error rate
UR  - http://www.scopus.com/inward/record.url?scp=15544362262&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=15544362262&partnerID=8YFLogxK
U2  - 10.1002/sim.1958
DO  - 10.1002/sim.1958
M3  - Article
VL  - 24
SP  - 911
EP  - 923
JO  - Statistics in Medicine
JF  - Statistics in Medicine
SN  - 0277-6715
IS  - 6
ER  -