A comparison of power analysis methods for evaluating effects of a predictor on slopes in longitudinal designs with missing data

Research output: Contribution to journal › Article

Abstract

In many longitudinal studies, evaluating the effect of a binary or continuous predictor variable on the rate of change of the outcome, i.e., the slope, is of primary interest. Sample size determination for these studies, however, is complicated by the expectation that missing data will occur due to missed visits, early dropout, and staggered entry. Despite the availability of methods for assessing power in longitudinal studies with missing data, the impact on power of the magnitude and distribution of missing data in the study population remains poorly understood. As a result, simple but erroneous alterations of the sample size formulae for complete/balanced data are commonly applied. These 'naive' approaches include the average sum of squares and average number of subjects methods. The goal of this article is to explore in greater detail the effect of missing data on study power and to compare the performance of naive sample size methods with that of a correct maximum likelihood-based method, using both mathematical and simulation-based approaches. Two different longitudinal aging studies are used to illustrate the methods.

Original language: English (US)
Pages (from-to): 1009-1029
Number of pages: 21
Journal: Statistical Methods in Medical Research
Volume: 24
Issue number: 6
DOI: 10.1177/0962280212437452
State: Published - Dec 1 2015

Fingerprint

Power Analysis
Missing Data
Predictors
Slope
Sample Size
Longitudinal Studies
Longitudinal Study
Sample Size Determination
Rate of change
Drop out
Sum of squares
Maximum Likelihood
Availability
Design
Binary
Population
Simulation

Keywords

  • compound symmetry
  • intraclass correlation
  • linear mixed effects model
  • monotone missing
  • sample size

ASJC Scopus subject areas

  • Epidemiology
  • Health Information Management
  • Statistics and Probability

Cite this

@article{74bd76abd16044d0a497839e66aaed18,
title = "A comparison of power analysis methods for evaluating effects of a predictor on slopes in longitudinal designs with missing data",
abstract = "In many longitudinal studies, evaluating the effect of a binary or continuous predictor variable on the rate of change of the outcome, i.e. slope, is often of primary interest. Sample size determination of these studies, however, is complicated by the expectation that missing data will occur due to missed visits, early drop out, and staggered entry. Despite the availability of methods for assessing power in longitudinal studies with missing data, the impact on power of the magnitude and distribution of missing data in the study population remain poorly understood. As a result, simple but erroneous alterations of the sample size formulae for complete/balanced data are commonly applied. These 'naive' approaches include the average sum of squares and average number of subjects methods. The goal of this article is to explore in greater detail the effect of missing data on study power and compare the performance of naive sample size methods to a correct maximum likelihood-based method using both mathematical and simulation-based approaches. Two different longitudinal aging studies are used to illustrate the methods.",
keywords = "compound symmetry, intraclass correlation, linear mixed effects model, monotone missing, sample size",
author = "Wang, Cuiling and Hall, {Charles B.} and Kim, Mimi",
year = "2015",
month = "12",
day = "1",
doi = "10.1177/0962280212437452",
language = "English (US)",
volume = "24",
pages = "1009--1029",
journal = "Statistical Methods in Medical Research",
issn = "0962-2802",
publisher = "SAGE Publications Ltd",
number = "6",

}

TY - JOUR
T1 - A comparison of power analysis methods for evaluating effects of a predictor on slopes in longitudinal designs with missing data
AU - Wang, Cuiling
AU - Hall, Charles B.
AU - Kim, Mimi
PY - 2015/12/1
Y1 - 2015/12/1
AB - In many longitudinal studies, evaluating the effect of a binary or continuous predictor variable on the rate of change of the outcome, i.e. slope, is often of primary interest. Sample size determination of these studies, however, is complicated by the expectation that missing data will occur due to missed visits, early drop out, and staggered entry. Despite the availability of methods for assessing power in longitudinal studies with missing data, the impact on power of the magnitude and distribution of missing data in the study population remain poorly understood. As a result, simple but erroneous alterations of the sample size formulae for complete/balanced data are commonly applied. These 'naive' approaches include the average sum of squares and average number of subjects methods. The goal of this article is to explore in greater detail the effect of missing data on study power and compare the performance of naive sample size methods to a correct maximum likelihood-based method using both mathematical and simulation-based approaches. Two different longitudinal aging studies are used to illustrate the methods.

KW - compound symmetry
KW - intraclass correlation
KW - linear mixed effects model
KW - monotone missing
KW - sample size
UR - http://www.scopus.com/inward/record.url?scp=84948427407&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84948427407&partnerID=8YFLogxK
U2 - 10.1177/0962280212437452
DO - 10.1177/0962280212437452
M3 - Article
VL - 24
SP - 1009
EP - 1029
JO - Statistical Methods in Medical Research
JF - Statistical Methods in Medical Research
SN - 0962-2802
IS - 6
ER -