### Abstract

Bias in parameter estimates can be substantial when heteroscedastic normal mixtures are misspecified as homoscedastic normal mixtures, and vice versa. We show through simulations that the maximum likelihood estimators under the false assumption of equal variances are inconsistent, and that the bias in parameter estimates is appreciable, and even substantial, when the mixture components are not well separated. Finite-sample bias in parameter estimates is close to the asymptotic bias even for a sample size of 200 or less. When homoscedastic normal mixtures are misspecified as heteroscedastic normal mixtures, the maximum likelihood estimators are consistent. However, the maximum likelihood estimators under a correctly specified homoscedastic mixture model converge to the true parameter values faster than those under a misspecified heteroscedastic mixture model. Under the false assumption of unequal variances, a lower bound is imposed on the component variances to ensure that the likelihood is bounded; the bias of the maximum likelihood estimators is less dependent on this lower bound when the sample size is 500 or more and the component distributions are well separated. An example is given to demonstrate the effects of a misspecification of the component variances on estimates of the prevalence of hypertension using normal mixtures.
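
The two competing fits described in the abstract can be sketched as follows. This is a minimal illustration, not the simulation design used in the paper: the function names (`simulate_mixture`, `fit_em`), the true parameter values, the starting values, and the variance floor of `1e-3` are all assumptions chosen for the example. It runs EM on a two-component normal mixture twice, once with the variances constrained to be equal and once with unequal variances bounded below.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def simulate_mixture(n, p, mean1, sd1, mean2, sd2, rng):
    """Draw n points from p*N(mean1, sd1^2) + (1-p)*N(mean2, sd2^2)."""
    return [
        rng.gauss(mean1, sd1) if rng.random() < p else rng.gauss(mean2, sd2)
        for _ in range(n)
    ]


def fit_em(data, equal_var, var_floor=1e-3, n_iter=200):
    """EM for a two-component normal mixture.

    equal_var=True pools the variance (homoscedastic model);
    equal_var=False estimates two variances, each kept >= var_floor
    so that the likelihood stays bounded (heteroscedastic model).
    """
    n = len(data)
    ordered = sorted(data)
    half = n // 2
    # Assumed starting values: means of the lower and upper halves of the data.
    mean1 = sum(ordered[:half]) / half
    mean2 = sum(ordered[half:]) / (n - half)
    grand = sum(data) / n
    var1 = var2 = max(sum((x - grand) ** 2 for x in data) / n, var_floor)
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        resp = []
        for x in data:
            a = p * normal_pdf(x, mean1, var1)
            b = (1.0 - p) * normal_pdf(x, mean2, var2)
            total = a + b
            resp.append(a / total if total > 0.0 else 0.5)
        # M-step: weighted proportion, means, and variances.
        n1 = sum(resp)
        n2 = n - n1
        if n1 < 1e-8 or n2 < 1e-8:
            break  # one component has emptied; stop rather than divide by ~0
        p = n1 / n
        mean1 = sum(r * x for r, x in zip(resp, data)) / n1
        mean2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        ss1 = sum(r * (x - mean1) ** 2 for r, x in zip(resp, data))
        ss2 = sum((1.0 - r) * (x - mean2) ** 2 for r, x in zip(resp, data))
        if equal_var:
            var1 = var2 = max((ss1 + ss2) / n, var_floor)
        else:
            var1 = max(ss1 / n1, var_floor)
            var2 = max(ss2 / n2, var_floor)
    return {"p": p, "mean1": mean1, "mean2": mean2, "var1": var1, "var2": var2}


rng = random.Random(0)
# Illustrative truth (an assumption): overlapping components with unequal variances.
data = simulate_mixture(1000, 0.5, 0.0, 1.0, 2.0, 2.0, rng)
homo_fit = fit_em(data, equal_var=True)     # misspecified: equal variances
hetero_fit = fit_em(data, equal_var=False)  # correctly specified here
print("equal-variance fit:  ", homo_fit)
print("unequal-variance fit:", hetero_fit)
```

To approximate bias rather than inspect a single fit, one could repeat the simulation over many seeds and sample sizes and average the difference between each estimate and the value used to generate the data.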

Original language | English (US) |
---|---|
Pages (from-to) | 2739-2747 |
Number of pages | 9 |
Journal | Computational Statistics and Data Analysis |
Volume | 55 |
Issue number | 9 |
DOIs | https://doi.org/10.1016/j.csda.2011.04.007 |
State | Published - Sep 1 2011 |

### Keywords

- Asymptotic bias
- Bootstrap
- EM algorithm
- Normal mixture
- Systolic blood pressure

### ASJC Scopus subject areas

- Computational Mathematics
- Computational Theory and Mathematics
- Statistics and Probability
- Applied Mathematics

### Cite this

**Bias from misspecification of the component variances in a normal mixture.** / Lo, Yungtai.

Research output: Contribution to journal › Article

*Computational Statistics and Data Analysis*, vol. 55, no. 9, 2011, pp. 2739-2747. https://doi.org/10.1016/j.csda.2011.04.007

TY - JOUR

T1 - Bias from misspecification of the component variances in a normal mixture

AU - Lo, Yungtai

PY - 2011/9/1

Y1 - 2011/9/1

KW - Asymptotic bias

KW - Bootstrap

KW - EM algorithm

KW - Normal mixture

KW - Systolic blood pressure

UR - http://www.scopus.com/inward/record.url?scp=79956130569&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79956130569&partnerID=8YFLogxK

U2 - 10.1016/j.csda.2011.04.007

DO - 10.1016/j.csda.2011.04.007

M3 - Article

AN - SCOPUS:79956130569

VL - 55

SP - 2739

EP - 2747

JO - Computational Statistics and Data Analysis

JF - Computational Statistics and Data Analysis

SN - 0167-9473

IS - 9

ER -