Predicting dementia with routine care EMR data

Zina Ben Miled, Kyle Haas, Christopher M. Black, Rezaul Karim Khandker, Vasu Chandrasekaran, Richard Lipton, Malaz A. Boustani

Research output: Contribution to journal › Article › peer-review

21 Scopus citations


Our aim is to develop a machine learning (ML) model that can predict dementia in a general patient population from multiple health care institutions one year and three years prior to the onset of the disease without any additional monitoring or screening. The purpose of the model is to automate the cost-effective, non-invasive, digital pre-screening of patients at risk for dementia. To this end, routine care data, which are widely available through Electronic Medical Record (EMR) systems, are used as the data source. These data embody a wealth of knowledge and make related medical applications easy to deploy at scale in a cost-effective manner. Specifically, the model is trained using structured and unstructured data from three EMR data sets: diagnoses, prescriptions, and medical notes. Each of these three data sets is used to construct an individual model, and a combined model is derived from all three data sets. Human-interpretable data processing and ML techniques are selected to facilitate adoption of the proposed model by health care providers from multiple institutions. The results show that the combined model generalizes across multiple institutions and is able to predict dementia within one year of its onset with an accuracy of nearly 80%, even though it was trained on routine care data. Moreover, analysis of the models identified important predictors of dementia. Some of these predictors (e.g., age and hypertensive disorders) are already confirmed by the literature, while others, especially those derived from the unstructured medical notes, require further clinical analysis.
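To make the prediction horizons and the individual versus combined feature sets more concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the record layout, the function names `features_before_cutoff` and `combine`, the token counts used as features, and the sample records are all assumptions made for illustration. The sketch keeps only records dated before the onset date minus the horizon, counts tokens separately for each of the three data sources, and then merges them into one combined feature set.

```python
from datetime import date, timedelta

# Hypothetical record layout (an assumption): (source, event_date, token)
SOURCES = ("diagnosis", "prescription", "note")


def features_before_cutoff(records, index_date, horizon_years):
    """Count tokens per source using only records dated before the cutoff.

    The cutoff is index_date minus the prediction horizon, so nothing
    observed inside the horizon window is used as a feature.
    """
    cutoff = index_date - timedelta(days=365 * horizon_years)
    per_source = {source: {} for source in SOURCES}
    for source, event_date, token in records:
        if event_date < cutoff:
            counts = per_source[source]
            counts[token] = counts.get(token, 0) + 1
    return per_source


def combine(per_source):
    """Merge the per-source counts into one feature dict with prefixed keys."""
    combined = {}
    for source, counts in per_source.items():
        for token, n in counts.items():
            combined[f"{source}:{token}"] = n
    return combined


# Made-up sample patient history, for illustration only.
records = [
    ("diagnosis", date(2014, 3, 1), "hypertensive disorder"),
    ("prescription", date(2015, 6, 10), "drug_a"),
    ("note", date(2016, 11, 5), "memory"),
    ("diagnosis", date(2017, 8, 20), "hypertensive disorder"),
]
onset = date(2018, 1, 15)

one_year = combine(features_before_cutoff(records, onset, 1))
three_year = combine(features_before_cutoff(records, onset, 3))
```

Each per-source dictionary could feed an individual model and the merged dictionary a combined one. The three-year feature set is necessarily sparser than the one-year set, since it draws on a shorter stretch of the patient's history.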

Original language: English (US)
Article number: 101771
Journal: Artificial Intelligence in Medicine
State: Published - Jan 2020


Keywords

  • Dementia
  • EMR
  • Machine learning
  • Prediction
  • Random forest

ASJC Scopus subject areas

  • Medicine (miscellaneous)
  • Artificial Intelligence

