Computing Reviews
Daehr: a discriminant analysis framework for electronic health record data and an application to early detection of mental health disorders
Xiong H., Zhang J., Huang Y., Leach K., Barnes L. ACM Transactions on Intelligent Systems and Technology 8(3): 1-21, 2017. Type: Article
Date Reviewed: Jun 1 2017

Xiong et al. present Daehr, an extension of the linear discriminant analysis (LDA) framework for early disease detection using electronic health record (EHR) data. Conventional LDA faces two important challenges in this setting: (a) “it is difficult to train an accurate LDA” when few samples are available for training; and (b) the data is always heterogeneous and contains significant noise. To address these issues, Daehr trains the LDA model using alternating projections that combine ℓ1-penalized sparse matrix estimation with nearest positive-definite matrix approximation. Daehr is designed to “(1) eliminate the data noise caused by the manual encoding of EHR data” and (2) lower the variance of the estimated LDA parameters (the covariance matrices) when only a few patients’ EHRs are available for training.
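The general idea of combining sparse covariance estimation with a positive-definite correction can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simplified stand-in that soft-thresholds the sample covariance, then alternates between a projection onto the resulting sparsity pattern and an eigenvalue-clipping projection onto positive-definite matrices, and finally plugs the estimate into a two-class LDA rule. All function names, the penalty `lam`, and the iteration count are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(S, lam):
    """l1-style shrinkage of the off-diagonal entries; the diagonal is kept."""
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)
    np.fill_diagonal(T, np.diag(S))
    return T

def nearest_pd(A, eps=1e-6):
    """Symmetrize A and clip its eigenvalues from below at eps."""
    A = (A + A.T) / 2.0
    w, V = np.linalg.eigh(A)
    return (V * np.maximum(w, eps)) @ V.T

def sparse_pd_covariance(X, lam=0.1, n_iter=25):
    """Hypothetical estimator: sparse pattern from thresholding, then
    alternate projections between that pattern and the PD set."""
    S = np.cov(X, rowvar=False)
    T = soft_threshold(S, lam)
    mask = T != 0                      # sparsity pattern to preserve
    C = T
    for _ in range(n_iter):
        C = nearest_pd(C)              # projection onto PD matrices
        C = np.where(mask, C, 0.0)     # projection onto the sparsity pattern
    return nearest_pd(C)               # end on the PD side so C is invertible

def fit_lda(X0, X1, lam=0.1):
    """Two-class LDA with a shared, regularized covariance estimate."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    centered = np.vstack([X0 - mu0, X1 - mu1])
    C = sparse_pd_covariance(centered, lam)
    w = np.linalg.solve(C, mu1 - mu0)
    b = -0.5 * (mu0 + mu1) @ w + np.log(len(X1) / len(X0))
    return w, b

def predict(X, w, b):
    """Return 1 where the linear discriminant is positive, else 0."""
    return (X @ w + b > 0).astype(int)
```

With few training samples, the raw sample covariance can be noisy or singular; the shrinkage step reduces variance and the clipping step guarantees that the matrix can be inverted in the LDA rule.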

To test the framework, the authors use the College Health Surveillance Network, a large real-world EHR dataset, and evaluate the performance of the proposed framework in “identifying college students at high risk for mental health disorders.” The authors consider different numbers of training samples, and experimental “results demonstrate Daehr significantly outperforms the three baselines (LDA and its derivatives) by achieving 1.4 to 19.4 percent higher accuracy and a 7.5 to 43.5 percent higher F1-score.” The authors also compare the proposed framework with other predictive models, namely support vector machines (SVM), logistic regression, and AdaBoost. The results show that, compared to these models, Daehr achieves 2.3 to 19.4 percent higher accuracy and a 7.5 to 43.5 percent higher F1-score. In terms of computational cost, Daehr takes the longest to train, but classifying a patient with Daehr is fast, with an average time similar to that of LDA.
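For readers less familiar with the two reported metrics, the following minimal sketch shows how accuracy and the F1-score are computed from binary predictions. The labels are invented toy values, not data from the paper, and the function name is hypothetical.

```python
def accuracy_and_f1(y_true, y_pred):
    """Compute accuracy and F1 for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, f1

# Toy example (invented): 3 positive cases out of 10, only one detected.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
```

In this toy case the classifier is right on 8 of 10 cases but finds only one of the three positives, so the F1-score is much lower than the accuracy. This is one reason a gain in F1 can be far larger than the corresponding gain in accuracy.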

The proposed Daehr framework can help to address the issues of few training samples and significant dataset noise. Thus, even though Daehr needs much more computation time for training, it is an efficient data mining tool for future EHR-based early detection of disease.

Reviewer: Kam-Yiu Lam
Review #: CR145315 (1708-0554)
Data Mining (H.2.8 ... )
Life And Medical Sciences (J.3 )