Computing Reviews
Towards big data Bayesian network learning -- an ensemble learning based approach
Tang Y., Wang Y., Cooper K., Li L. BigData Congress (Proceedings of the 2014 IEEE International Congress on Big Data), 355-357, 2014. Type: Proceedings
Date Reviewed: May 6 2015

A Bayesian network (BN) is a directed probabilistic graphical model that represents dependency relationships among variables. Over 50 learning algorithms exist for BNs. This paper proposes a big data-focused BN model learning algorithm: the parallel ensemble-based Bayesian network learning algorithm (PENBays). PENBays combines three elements: a new arc score that assesses data quality before the data is modeled with a BN, ensemble learning, and a distributed computing model. Datasets with an arc score larger than -0.5 are deemed suitable for BN learning. Five large datasets (more than 500 million rows) are used to show that PENBays outperforms three other BN model learning algorithms: max-min hill-climbing (MMHC), the three-phase dependency analysis algorithm (TPDA), and REC. The structural Hamming distance (SHD) metric is used for comparison. SHD is the number of edge insertions, deletions, or flips necessary to transform one graph into another graph.
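To make the SHD definition concrete, here is a minimal Python sketch. It is an illustration only, not the implementation used in the paper. It assumes each graph is given as a collection of directed edges `(parent, child)` with at most one edge between any pair of nodes; the function name and the example graphs are hypothetical.

```python
def structural_hamming_distance(learned_edges, reference_edges):
    """Count the edge insertions, deletions, and flips that separate two graphs.

    Assumption: each graph is an iterable of (parent, child) tuples with at
    most one edge between any pair of nodes, as in a DAG.
    """
    # Index every edge by its unordered node pair so that a flipped edge
    # is compared against its counterpart instead of counted twice.
    learned = {frozenset(edge): tuple(edge) for edge in learned_edges}
    reference = {frozenset(edge): tuple(edge) for edge in reference_edges}

    distance = 0
    for pair in learned.keys() | reference.keys():
        # A mismatch is a missing edge, an extra edge, or a reversed edge;
        # each costs exactly one operation.
        if learned.get(pair) != reference.get(pair):
            distance += 1
    return distance


# Hypothetical example graphs, not taken from the paper.
reference = [("A", "B"), ("B", "C"), ("A", "C")]
learned = [("A", "B"), ("C", "B"), ("C", "D")]

# B-C is flipped, A-C is missing, and C-D is extra.
print(structural_hamming_distance(learned, reference))  # prints 3
```

A lower SHD means the learned structure is closer to the reference structure, which is the sense in which the algorithms are compared here.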

A comparison of performance with respect to execution time and predictive accuracy is missing and would have been a useful addition to this study. In addition, an analysis of how a low arc score affects the algorithm's performance would help determine how important the score is to the overall performance. The key points are introduced succinctly in this paper before being used.

I noticed a typographical error: “In contrary, data set which yields had BN structure has very low Formula, generally smaller than -0.5, or sometimes -1 or even -2. Hence, Formula could be used as an suitable measure for data set quality.” This should read: “On the contrary, the data set that yields bad BN structure has very low Formula, generally smaller than -0.5, or sometimes -1 or even -2. Hence, Formula could be used as a suitable measure for data set quality.”

Reviewer:  Pragyansmita Nayak Review #: CR143417 (1508-0727)
Knowledge Representation Formalisms And Methods (I.2.4)
Learning (I.2.6)
Problem Solving, Control Methods, And Search (I.2.8)
