Bias from misspecification of the component variances in a normal mixture

Research output: Contribution to journal › Article

6 Citations (Scopus)

Abstract

Bias in parameter estimates can be substantial when heteroscedastic normal mixtures are misspecified as homoscedastic normal mixtures, and vice versa. We show through simulations that the maximum likelihood estimators under the false assumption of equal variances are inconsistent and bias in parameter estimates is appreciable and even substantial when the mixture components are not well-separated. Finite sample bias in parameter estimates is close to the asymptotic bias even for a sample size of 200 or less. When homoscedastic normal mixtures are misspecified as heteroscedastic normal mixtures, the maximum likelihood estimators are consistent. However, the maximum likelihood estimators under a correctly specified homoscedastic mixture model converge to the true parameter values faster than those under a misspecified heteroscedastic mixture model. The bias of the maximum likelihood estimators is less dependent on the lower bound imposed on the component variances to ensure that the likelihood is bounded under the false assumption of unequal variances when the sample size is 500 or more and the component distributions are well-separated. An example is given to demonstrate the effects of a misspecification of the component variances on estimates of the prevalence of hypertension using normal mixtures.
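The comparison described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the paper's code: the mixture parameters, sample size, seed, starting values, and the function names `simulate_mixture` and `fit_em` are all hypothetical, and the lower bound on the variances is enforced by simple clamping in the M-step.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def simulate_mixture(n, p, mu1, sd1, mu2, sd2, seed=0):
    """Draw n values from p*N(mu1, sd1^2) + (1-p)*N(mu2, sd2^2)."""
    rng = random.Random(seed)
    return [
        rng.gauss(mu1, sd1) if rng.random() < p else rng.gauss(mu2, sd2)
        for _ in range(n)
    ]


def fit_em(data, equal_var, var_floor=1e-3, n_iter=200):
    """EM for a two-component normal mixture (illustrative helper).

    equal_var=True fits a common variance (homoscedastic model);
    equal_var=False fits separate variances, each clamped at var_floor
    so the likelihood stays bounded (an assumed, simple form of the bound).
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    # Crude starting values: means of the lower and upper halves of the sample.
    mu1 = sum(xs[:half]) / half
    mu2 = sum(xs[half:]) / (n - half)
    grand_mean = sum(xs) / n
    var1 = var2 = max(sum((x - grand_mean) ** 2 for x in xs) / n, var_floor)
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in xs:
            a = p * normal_pdf(x, mu1, var1)
            b = (1.0 - p) * normal_pdf(x, mu2, var2)
            resp.append(a / (a + b) if a + b > 0.0 else 0.5)
        # M-step: weighted updates of the mixing proportion, means, variances.
        n1 = min(max(sum(resp), 1e-12), n - 1e-12)
        n2 = n - n1
        p = n1 / n
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n2
        ss1 = sum(r * (x - mu1) ** 2 for r, x in zip(resp, xs))
        ss2 = sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, xs))
        if equal_var:
            var1 = var2 = max((ss1 + ss2) / n, var_floor)
        else:
            var1 = max(ss1 / n1, var_floor)
            var2 = max(ss2 / n2, var_floor)
    return {"p": p, "mu1": mu1, "mu2": mu2, "var1": var1, "var2": var2}


# Hypothetical truth: unequal variances and overlapping components.
sample = simulate_mixture(n=1000, p=0.5, mu1=0.0, sd1=1.0, mu2=2.0, sd2=2.0, seed=1)
homo_fit = fit_em(sample, equal_var=True)     # misspecified: equal variances
hetero_fit = fit_em(sample, equal_var=False)  # correctly specified
print("equal-variance fit:  ", homo_fit)
print("unequal-variance fit:", hetero_fit)
```

A single run like this only shows the mechanics. Repeating it over many simulated samples and averaging the differences between the fitted and generating values would give a Monte Carlo estimate of the bias under each assumption.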

Original language: English (US)
Pages (from-to): 2739-2747
Number of pages: 9
Journal: Computational Statistics and Data Analysis
Volume: 55
Issue number: 9
DOI: 10.1016/j.csda.2011.04.007
State: Published - Sep 1 2011

Fingerprint

Normal Mixture
Variance Components
Misspecification
Maximum Likelihood Estimator
Maximum likelihood
Mixture Model
Estimate
Sample Size
Heteroscedastic Model
Asymptotic Bias
Hypertension
Unequal
Inconsistent
Likelihood
Lower bound
Converge
Dependent
Demonstrate
Simulation

Keywords

  • Asymptotic bias
  • Bootstrap
  • EM algorithm
  • Normal mixture
  • Systolic blood pressure

ASJC Scopus subject areas

  • Computational Mathematics
  • Computational Theory and Mathematics
  • Statistics and Probability
  • Applied Mathematics

Cite this

Bias from misspecification of the component variances in a normal mixture. / Lo, Yungtai.

In: Computational Statistics and Data Analysis, Vol. 55, No. 9, 01.09.2011, p. 2739-2747.


@article{e5df9c8e26104a11a12c43b0f59abd51,
title = "Bias from misspecification of the component variances in a normal mixture",
abstract = "Bias in parameter estimates can be substantial when heteroscedastic normal mixtures are misspecified as homoscedastic normal mixtures, and vice versa. We show through simulations that the maximum likelihood estimators under the false assumption of equal variances are inconsistent and bias in parameter estimates is appreciable and even substantial when the mixture components are not well-separated. Finite sample bias in parameter estimates is close to the asymptotic bias even for a sample size of 200 or less. When homoscedastic normal mixtures are misspecified as heteroscedastic normal mixtures, the maximum likelihood estimators are consistent. However, the maximum likelihood estimators under a correctly specified homoscedastic mixture model converge to the true parameter values faster than those under a misspecified heteroscedastic mixture model. The bias of the maximum likelihood estimators is less dependent on the lower bound imposed on the component variances to ensure that the likelihood is bounded under the false assumption of unequal variances when the sample size is 500 or more and the component distributions are well-separated. An example is given to demonstrate the effects of a misspecification of the component variances on estimates of the prevalence of hypertension using normal mixtures.",
keywords = "Asymptotic bias, Bootstrap, EM algorithm, Normal mixture, Systolic blood pressure",
author = "Yungtai Lo",
year = "2011",
month = "9",
day = "1",
doi = "10.1016/j.csda.2011.04.007",
language = "English (US)",
volume = "55",
pages = "2739--2747",
journal = "Computational Statistics and Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
number = "9"
}

TY - JOUR
T1 - Bias from misspecification of the component variances in a normal mixture
AU - Lo, Yungtai
PY - 2011/9/1
Y1 - 2011/9/1

N2 - Bias in parameter estimates can be substantial when heteroscedastic normal mixtures are misspecified as homoscedastic normal mixtures, and vice versa. We show through simulations that the maximum likelihood estimators under the false assumption of equal variances are inconsistent and bias in parameter estimates is appreciable and even substantial when the mixture components are not well-separated. Finite sample bias in parameter estimates is close to the asymptotic bias even for a sample size of 200 or less. When homoscedastic normal mixtures are misspecified as heteroscedastic normal mixtures, the maximum likelihood estimators are consistent. However, the maximum likelihood estimators under a correctly specified homoscedastic mixture model converge to the true parameter values faster than those under a misspecified heteroscedastic mixture model. The bias of the maximum likelihood estimators is less dependent on the lower bound imposed on the component variances to ensure that the likelihood is bounded under the false assumption of unequal variances when the sample size is 500 or more and the component distributions are well-separated. An example is given to demonstrate the effects of a misspecification of the component variances on estimates of the prevalence of hypertension using normal mixtures.

AB - Bias in parameter estimates can be substantial when heteroscedastic normal mixtures are misspecified as homoscedastic normal mixtures, and vice versa. We show through simulations that the maximum likelihood estimators under the false assumption of equal variances are inconsistent and bias in parameter estimates is appreciable and even substantial when the mixture components are not well-separated. Finite sample bias in parameter estimates is close to the asymptotic bias even for a sample size of 200 or less. When homoscedastic normal mixtures are misspecified as heteroscedastic normal mixtures, the maximum likelihood estimators are consistent. However, the maximum likelihood estimators under a correctly specified homoscedastic mixture model converge to the true parameter values faster than those under a misspecified heteroscedastic mixture model. The bias of the maximum likelihood estimators is less dependent on the lower bound imposed on the component variances to ensure that the likelihood is bounded under the false assumption of unequal variances when the sample size is 500 or more and the component distributions are well-separated. An example is given to demonstrate the effects of a misspecification of the component variances on estimates of the prevalence of hypertension using normal mixtures.

KW - Asymptotic bias
KW - Bootstrap
KW - EM algorithm
KW - Normal mixture
KW - Systolic blood pressure
UR - http://www.scopus.com/inward/record.url?scp=79956130569&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79956130569&partnerID=8YFLogxK
U2 - 10.1016/j.csda.2011.04.007
DO - 10.1016/j.csda.2011.04.007
M3 - Article
AN - SCOPUS:79956130569
VL - 55
SP - 2739
EP - 2747
JO - Computational Statistics and Data Analysis
JF - Computational Statistics and Data Analysis
SN - 0167-9473
IS - 9
ER -