TY - JOUR
T1 - Predicting dementia with routine care EMR data
AU - Ben Miled, Zina
AU - Haas, Kyle
AU - Black, Christopher M.
AU - Khandker, Rezaul Karim
AU - Chandrasekaran, Vasu
AU - Lipton, Richard
AU - Boustani, Malaz A.
N1 - Publisher Copyright: © 2019 Elsevier B.V.
PY - 2020/1
Y1 - 2020/1
AB - Our aim is to develop a machine learning (ML) model that can predict dementia in a general patient population from multiple health care institutions one year and three years prior to the onset of the disease without any additional monitoring or screening. The purpose of the model is to automate the cost-effective, non-invasive, digital pre-screening of patients at risk for dementia. Towards this purpose, routine care data, which are widely available through Electronic Medical Record (EMR) systems, are used as a data source. These data embody rich knowledge and make related medical applications easy to deploy at scale in a cost-effective manner. Specifically, the model is trained using structured and unstructured data from three EMR data sets: diagnosis, prescriptions, and medical notes. Each of these three data sets is used to construct an individual model, and a combined model is derived from all three data sets. Human-interpretable data processing and ML techniques are selected to facilitate adoption of the proposed model by health care providers from multiple institutions. The results show that the combined model is generalizable across multiple institutions and is able to predict dementia within one year of its onset with an accuracy of nearly 80%, despite being trained on routine care data. Moreover, the analysis of the models identified important predictors for dementia. Some of these predictors (e.g., age and hypertensive disorders) are already confirmed by the literature, while others, especially the ones derived from the unstructured medical notes, require further clinical analysis.
KW - Dementia
KW - EMR
KW - Machine learning
KW - Prediction
KW - Random forest
UR - http://www.scopus.com/inward/record.url?scp=85076191922&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076191922&partnerID=8YFLogxK
U2 - 10.1016/j.artmed.2019.101771
DO - 10.1016/j.artmed.2019.101771
M3 - Article
C2 - 31980108
AN - SCOPUS:85076191922
SN - 0933-3657
VL - 102
JO - Artificial Intelligence in Medicine
JF - Artificial Intelligence in Medicine
M1 - 101771
ER -