Using link-preserving imputation for logistic partially linear models with missing covariates

Qixuan Chen, Myunghee Cho Paik, Minjin Kim, Cuiling Wang

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

To handle missing data, one needs to specify auxiliary models such as the probability of observation or an imputation model. The doubly robust (DR) method uses both auxiliary models and produces consistent estimation when either of the models is correctly specified. While the DR method in estimating equation approaches can be easy to implement in the case of missing outcomes, it is computationally cumbersome in the case of missing covariates, especially in the context of semiparametric regression models. In this paper, we propose a new kernel-assisted estimating equation method for logistic partially linear models with missing covariates. We replace the conditional expectation in the DR estimating function with an unbiased estimating function constructed using the conditional mean of the outcome given the observed data, and impute the missing covariates using the so-called link-preserving imputation models to simplify the estimation. The proposed method is valid when the response model is correctly specified and is more efficient than the kernel-assisted inverse probability weighting estimator by Liang (2008). The proposed estimator is consistent and asymptotically normal. We evaluate the finite sample performance in terms of efficiency and robustness, and illustrate the application of the proposed method to health insurance data from the 2011-2012 National Health and Nutrition Examination Survey, in which data were collected in two phases and some covariates were partially missing in the second phase.
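
For orientation, the model class and the inverse probability weighting (IPW) idea named in the abstract can be written in generic notation. This is a minimal sketch and not the paper's own formulation: the symbols Y, X, T, \beta, g, \delta_i and \pi_i are assumptions introduced here for illustration, and the kernel step that estimates g is omitted.

```latex
% Logistic partially linear model (generic notation, assumed here):
% binary outcome Y, covariates X with linear effect \beta, and a covariate T
% that enters through an unspecified smooth function g.
\[
  \mathrm{logit}\, P(Y = 1 \mid X, T) = X^{\top}\beta + g(T)
\]
% Generic IPW estimating equation for \beta (assumed form):
% \delta_i = 1 if X_i is fully observed and 0 otherwise,
% \pi_i = P(\delta_i = 1 \mid \text{always-observed data}),
% \mathrm{expit}(u) = 1 / (1 + e^{-u}).
\[
  \sum_{i=1}^{n} \frac{\delta_i}{\pi_i}\, X_i
  \left\{ Y_i - \mathrm{expit}\!\left( X_i^{\top}\beta + g(T_i) \right) \right\} = 0
\]
```

Weighting the complete cases by 1/\pi_i keeps this equation unbiased when the observation probability model is correct. A DR estimator typically augments it with a term built from the imputation model, which is the part the abstract describes as computationally cumbersome for missing covariates.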

Original language: English (US)
Pages (from-to): 174-185
Number of pages: 12
Journal: Computational Statistics and Data Analysis
Volume: 101
DOIs: 10.1016/j.csda.2016.03.004
State: Published - Sep 1 2016

Keywords

  • Doubly robust estimator
  • Inverse probability weighting
  • Kernel-assisted estimating equation
  • Link-preserving imputation
  • Logistic partially linear models
  • Missing covariates

ASJC Scopus subject areas

  • Computational Mathematics
  • Computational Theory and Mathematics
  • Statistics and Probability
  • Applied Mathematics

Cite this

Using link-preserving imputation for logistic partially linear models with missing covariates. / Chen, Qixuan; Paik, Myunghee Cho; Kim, Minjin; Wang, Cuiling.

In: Computational Statistics and Data Analysis, Vol. 101, 01.09.2016, p. 174-185.

@article{428161671fc94a299de228192e75398e,
title = "Using link-preserving imputation for logistic partially linear models with missing covariates",
abstract = "To handle missing data one needs to specify auxiliary models such as the probability of observation or imputation model. Doubly robust (DR) method uses both auxiliary models and produces consistent estimation when either of the model is correctly specified. While the DR method in estimating equation approaches could be easy to implement in the case of missing outcomes, it is computationally cumbersome in the case of missing covariates especially in the context of semiparametric regression models. In this paper, we propose a new kernel-assisted estimating equation method for logistic partially linear models with missing covariates. We replace the conditional expectation in the DR estimating function with an unbiased estimating function constructed using the conditional mean of the outcome given the observed data, and impute the missing covariates using the so called link-preserving imputation models to simplify the estimation. The proposed method is valid when the response model is correctly specified and is more efficient than the kernel-assisted inverse probability weighting estimator by Liang (2008). The proposed estimator is consistent and asymptotically normal. We evaluate the finite sample performance in terms of efficiency and robustness, and illustrate the application of the proposed method to the health insurance data using the 2011-2012 National Health and Nutrition Examination Survey, in which data were collected in two phases and some covariates were partially missing in the second phase.",
keywords = "Doubly robust estimator, Inverse probability weighting, Kernel-assisted estimating equation, Link-preserving imputation, Logistic partially linear models, Missing covariates",
author = "Chen, Qixuan and Paik, {Myunghee Cho} and Kim, Minjin and Wang, Cuiling",
year = "2016",
month = "9",
day = "1",
doi = "10.1016/j.csda.2016.03.004",
language = "English (US)",
volume = "101",
pages = "174--185",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
}

TY - JOUR
T1 - Using link-preserving imputation for logistic partially linear models with missing covariates
AU - Chen, Qixuan
AU - Paik, Myunghee Cho
AU - Kim, Minjin
AU - Wang, Cuiling
PY - 2016/9/1
Y1 - 2016/9/1
N2 - To handle missing data one needs to specify auxiliary models such as the probability of observation or imputation model. Doubly robust (DR) method uses both auxiliary models and produces consistent estimation when either of the model is correctly specified. While the DR method in estimating equation approaches could be easy to implement in the case of missing outcomes, it is computationally cumbersome in the case of missing covariates especially in the context of semiparametric regression models. In this paper, we propose a new kernel-assisted estimating equation method for logistic partially linear models with missing covariates. We replace the conditional expectation in the DR estimating function with an unbiased estimating function constructed using the conditional mean of the outcome given the observed data, and impute the missing covariates using the so called link-preserving imputation models to simplify the estimation. The proposed method is valid when the response model is correctly specified and is more efficient than the kernel-assisted inverse probability weighting estimator by Liang (2008). The proposed estimator is consistent and asymptotically normal. We evaluate the finite sample performance in terms of efficiency and robustness, and illustrate the application of the proposed method to the health insurance data using the 2011-2012 National Health and Nutrition Examination Survey, in which data were collected in two phases and some covariates were partially missing in the second phase.

KW - Doubly robust estimator
KW - Inverse probability weighting
KW - Kernel-assisted estimating equation
KW - Link-preserving imputation
KW - Logistic partially linear models
KW - Missing covariates
UR - http://www.scopus.com/inward/record.url?scp=84962306234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962306234&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2016.03.004
DO - 10.1016/j.csda.2016.03.004
M3 - Article
VL - 101
SP - 174
EP - 185
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
SN - 0167-9473
ER -