Computing Reviews, the leading online review service for computing literature.

Towards big data Bayesian network learning -- an ensemble learning based approach
Tang Y., Wang Y., Cooper K., Li L. BigData Congress (Proceedings of the 2014 IEEE International Congress on Big Data), 355-357. 2014. Type: Proceedings

Date Reviewed: May 6 2015

A Bayesian network (BN) is a directed probabilistic graphical model used to represent dependency relationships among variables. Over 50 learning algorithms exist for BNs. This paper proposes a big data-focused BN model learning algorithm: the parallel ensemble-based Bayesian network learning algorithm (PENBays). PENBays combines three elements: a new arc score that assesses the quality of a dataset before it is modeled with a BN, ensemble learning, and a distributed computing model. Datasets with an arc score larger than -0.5 are determined to be suitable for BN learning.

Five large datasets (more than 500 million rows) are used to show that PENBays outperforms three other BN model learning algorithms: max-min hill-climbing (MMHC), the three-phase dependency analysis algorithm (TPDA), and REC. The structural Hamming distance (SHD) metric is used for comparison. SHD is the number of edge insertions, deletions, or flips needed to transform one graph into another.

A comparison of execution time and predictive accuracy is missing and would have strengthened the study. In addition, an analysis of how a low arc score affects the algorithm's performance would help determine how important the score is to the overall result. The key points are introduced succinctly in this paper before being used. I noticed a typographical error: “In contrary, data set which yields had BN structure has very low Formula, generally smaller than -0.5, or sometimes -1 or even -2. Hence, Formula could be used as an suitable measure for data set quality.” This should read: “On the contrary, the data set that yields bad BN structure has very low Formula, generally smaller than -0.5, or sometimes -1 or even -2. Hence, Formula could be used as a suitable measure for data set quality.”
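To make the SHD metric mentioned in the review concrete, the following is a minimal sketch, not the paper's implementation. It assumes each graph is given as a set of directed edges `(parent, child)` over the same nodes, with no self-loops; the function name and the example graphs are hypothetical. Each pair of nodes whose connection differs between the two graphs (edge missing, extra, or reversed) counts as one operation.

```python
def structural_hamming_distance(edges_a, edges_b):
    """Count edge insertions, deletions, and flips between two directed graphs.

    Assumption: edges are (parent, child) tuples and there are no self-loops.
    """
    a, b = set(edges_a), set(edges_b)
    # Every unordered node pair that is connected in at least one graph.
    pairs = {frozenset(edge) for edge in a | b}
    distance = 0
    for pair in pairs:
        u, v = tuple(pair)
        # State of this pair in each graph: (u->v present, v->u present).
        state_a = ((u, v) in a, (v, u) in a)
        state_b = ((u, v) in b, (v, u) in b)
        if state_a != state_b:
            distance += 1  # one insertion, deletion, or flip
    return distance


# Hypothetical example graphs for illustration only.
learned = {("A", "B"), ("B", "C"), ("C", "D")}
reference = {("A", "B"), ("C", "B"), ("A", "D")}
print(structural_hamming_distance(learned, reference))
```

In this example, the edge between B and C is reversed, the edge C->D appears only in the learned graph, and the edge A->D appears only in the reference graph, so the distance is 3. A lower SHD means the learned structure is closer to the reference structure.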

Reviewer: Pragyansmita Nayak	Review #: CR143417 (1508-0727)

Knowledge Representation Formalisms And Methods (I.2.4)

Learning (I.2.6)

Problem Solving, Control Methods, And Search (I.2.8)


