Repeated measures regression in laboratory, clinical and environmental research: Common misconceptions in the matter of different within- and between-subject slopes

Donald R. Hoover; Qiuhu Shi; Igor Burstyn; Kathryn Anastos

doi:10.3390/ijerph16030504

Medicine

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

When repeated measures linear regression models are used to make causal inferences in laboratory, clinical and environmental research, it is typically assumed that the within-subject association of differences (or changes) in predictor variable values across replicates is the same as the between-subject association of differences in those predictor variable values. However, this is often false. For example, with body weight as the predictor variable and blood cholesterol (which increases with higher body fat) as the outcome: (i) a 10-lb. weight increase in the same adult more strongly indicates an increase in that adult's cholesterol than (ii) one adult weighing 10 lbs. more than a second adult indicates higher cholesterol in the heavier adult. A 10-lb. weight gain in the first adult more likely reflects a build-up of body fat in that person, while a second person being 10 lbs. heavier than the first could reflect other factors, such as the second person being taller. Hence, to make causal inferences, different within- and between-subject slopes should be modeled separately. A related misconception, common when generalized estimating equations (GEE) and mixed models are applied to repeated measures (i.e., for fitting cross-sectional regression), is that the working correlation structure influences only the variance of the parameter estimates. However, only an independence working correlation guarantees that the modeled parameters are interpretable. We illustrate this with an example where changing the working correlation from independence to equicorrelation qualitatively biases the parameters of GEE models, and show that this happens because the within- and between-subject slopes for the outcomes regressed on the predictor variables differ. We then systematically describe several common mechanisms that cause within- and between-subject slopes to differ: change effects, lag/reverse-lag and spillover causality, shared within-subject measurement bias or confounding, and predictor variable measurement error. The misconceptions we describe should be better publicized. Repeated measures analyses should compare the within- and between-subject slopes of predictors and, when they differ, investigate the causal reasons for this.
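The distinction drawn in the abstract can be sketched numerically. The following is a minimal illustration, not the authors' analysis: it simulates hypothetical weight and cholesterol data in which the within-subject slope (2.0) and the between-subject slope (0.5) are deliberately different, then estimates each one by splitting every observation into its subject mean and its deviation from that mean. All names, sample sizes and numeric values (`simulate`, `slopes`, 500 subjects, 4 visits, the means and standard deviations) are assumptions made up for this example.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x (intercept included)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n_subjects=500, n_visits=4, beta_within=2.0,
             beta_between=0.5, seed=0):
    """Hypothetical data: (subject, weight, cholesterol) per visit.

    All numeric values are invented for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for subject in range(n_subjects):
        usual = rng.gauss(170.0, 25.0)       # subject's usual weight
        for _ in range(n_visits):
            change = rng.gauss(0.0, 8.0)     # visit-level departure from usual
            weight = usual + change
            chol = (150.0
                    + beta_between * (usual - 170.0)  # between-subject effect
                    + beta_within * change            # within-subject effect
                    + rng.gauss(0.0, 5.0))            # residual noise
            rows.append((subject, weight, chol))
    return rows


def slopes(rows):
    """Return (within, between, pooled) slope estimates."""
    by_subject = {}
    for subject, x, y in rows:
        by_subject.setdefault(subject, []).append((x, y))

    mean_x, mean_y, dev_x, dev_y = [], [], [], []
    for obs in by_subject.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        mean_x.append(mx)
        mean_y.append(my)
        # Deviations from the subject's own mean carry only within-subject information.
        dev_x.extend(x - mx for x, _ in obs)
        dev_y.extend(y - my for _, y in obs)

    within = ols_slope(dev_x, dev_y)     # slope among deviations
    between = ols_slope(mean_x, mean_y)  # slope among observed subject means
    pooled = ols_slope([x for _, x, _ in rows], [y for _, _, y in rows])
    return within, between, pooled


within, between, pooled = slopes(simulate())
print(f"within={within:.2f} between={between:.2f} pooled={pooled:.2f}")
```

In this balanced design the pooled slope, which ignores the subject structure, is a weighted average of the other two and so always falls between them. For a linear model, a GEE fit with an independence working correlation gives this pooled point estimate, whereas an equicorrelated working structure generally shifts weight toward the within-subject slope; that shift is one way to read the abstract's warning about changing the working correlation. Note also that the slope among observed subject means lands slightly above the simulated 0.5, because each observed mean still contains some averaged within-subject variation.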

Original language	English (US)
Article number	504
Journal	International Journal of Environmental Research and Public Health
Volume	16
Issue number	3
DOIs	https://doi.org/10.3390/ijerph16030504
State	Published - Feb 1 2019

Keywords

Cross-sectional regression
Generalized estimating equations
Mixed models
Repeated measures
Within-/between-subject associations
Working correlation structure

ASJC Scopus subject areas

Public Health, Environmental and Occupational Health
Pollution
Health, Toxicology and Mutagenesis

Cite this

Repeated measures regression in laboratory, clinical and environmental research: Common misconceptions in the matter of different within- and between-subject slopes. / Hoover, Donald R.; Shi, Qiuhu; Burstyn, Igor et al.
In: International Journal of Environmental Research and Public Health, Vol. 16, No. 3, 504, 01.02.2019.

@article{59aa1c3f8e1140e094f1f1c91358a535,

title = "Repeated measures regression in laboratory, clinical and environmental research: Common misconceptions in the matter of different within- and between-subject slopes",

keywords = "Cross-sectional regression, Generalized estimating equations, Mixed models, Repeated measures, Within-/between-subject associations, Working correlation structure",

author = "Hoover, {Donald R.} and Qiuhu Shi and Igor Burstyn and Kathryn Anastos",

note = "Publisher Copyright: {\textcopyright} 2019 by the authors. Licensee MDPI, Basel, Switzerland.",

year = "2019",

month = feb,

day = "1",

doi = "10.3390/ijerph16030504",

language = "English (US)",

volume = "16",

journal = "International journal of environmental research and public health",

issn = "1661-7827",

publisher = "Multidisciplinary Digital Publishing Institute (MDPI)",

number = "3",

}

TY - JOUR

T1 - Repeated measures regression in laboratory, clinical and environmental research

T2 - Common misconceptions in the matter of different within- and between-subject slopes

AU - Hoover, Donald R.

AU - Shi, Qiuhu

AU - Burstyn, Igor

AU - Anastos, Kathryn

PY - 2019/2/1

Y1 - 2019/2/1

KW - Cross-sectional regression

KW - Generalized estimating equations

KW - Mixed models

KW - Repeated measures

KW - Within-/between-subject associations

KW - Working correlation structure

UR - http://www.scopus.com/inward/record.url?scp=85061484017&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85061484017&partnerID=8YFLogxK

U2 - 10.3390/ijerph16030504

DO - 10.3390/ijerph16030504

M3 - Article

C2 - 30754731

AN - SCOPUS:85061484017

SN - 1661-7827

VL - 16

JO - International journal of environmental research and public health

JF - International journal of environmental research and public health

IS - 3

M1 - 504

ER -
