Computing Reviews

Discrete Bayesian network classifiers: a survey
Bielza C., Larrañaga P. ACM Computing Surveys 47(1): 1-43, 2014. Type: Article
Date Reviewed: 09/28/16

Bayesian networks (BNs) are used to estimate the value of one attribute of a dataset (termed the predicted or class attribute) from the remaining attributes (termed the predictor attributes). The naive Bayes (NB) model, introduced in 1960, is the simplest BN classifier: it assumes that all the predictor attributes of a dataset are independent of each other given the class. Thirty years later, BN classifiers rose to prominence and have been increasingly applied to real-world problems. BN classifiers extend NB by allowing varying types of acyclic relationships among different subsets of attributes, and they predict the class attribute from the predictor attributes using those relationships. Among the possible values of the predicted attribute, the value with the highest posterior probability is chosen as the answer by the formulated Bayes classifier. This value is called the maximum a posteriori (MAP) class.
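
To make the MAP rule concrete, here is a minimal sketch of discrete NB classification in Python; the toy data, function names, and Laplace smoothing constant are illustrative choices of ours, not taken from the survey.

```python
from collections import Counter, defaultdict
from math import log

def train_nb(rows, labels, alpha=1.0):
    """Estimate class priors and per-attribute conditional counts from
    discrete data; alpha is a Laplace smoothing constant (our choice)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # value_counts[(i, c)][v] = count
    attr_values = defaultdict(set)        # distinct values seen for attribute i
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, c)][v] += 1
            attr_values[i].add(v)
    return class_counts, value_counts, attr_values, alpha

def predict_nb(model, row):
    """Return the MAP class: argmax over c of P(c) * prod_i P(x_i | c)."""
    class_counts, value_counts, attr_values, alpha = model
    n = sum(class_counts.values())
    best, best_logp = None, float("-inf")
    for c, n_c in class_counts.items():
        logp = log(n_c / n)  # log prior P(c)
        for i, v in enumerate(row):
            # smoothed conditional probability P(x_i = v | c)
            num = value_counts[(i, c)][v] + alpha
            den = n_c + alpha * len(attr_values[i])
            logp += log(num / den)
        if logp > best_logp:
            best, best_logp = c, logp
    return best

# Toy usage with two discrete predictors and a binary class.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "hot")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)
print(predict_nb(model, ("rain", "mild")))  # -> "yes"
```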

A BN model learned from data is best visualized as a directed acyclic graph (DAG), which makes it easy to interpret and to understand its intricacies. A confidence measure on the predicted class, ease of handling missing data, and computational efficiency are other reasons for the growing popularity and increasing use of BNs. BNs can handle both continuous and discrete-valued attributes; this paper surveys the discrete BN classifiers. Separate sections cover the different families: NB; selective NB; semi-naive Bayes; one-dependence Bayesian classifiers, such as tree-augmented naive Bayes (TAN); k-dependence Bayesian classifiers, where k is greater than one; Markov blanket-based classifiers; and Bayesian multinets.

NB is the simplest, as noted above. Semi-naive Bayes synthesizes new attributes as combinations of one or more of the existing attributes. One- and k-dependence classifiers represent increasing dependence among predictor attributes: in a one-dependence model, each predictor attribute may depend on at most one other predictor (in addition to the predicted attribute), while k-dependence expands this scope and allows dependence on at most k other predictor attributes (in addition to the predicted attribute). Extending this concept, constructing a tree for each possible value of the predicted attribute yields a forest, a special case of Bayesian multinets. Bayesian multinets use multiple networks, each learned from a partition of the dataset corresponding to one or more values of the class variable.
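
As one illustration of how a one-dependence structure is chosen, TAN weights each candidate edge between predictors by the conditional mutual information I(Xi; Xj | C) and then keeps a maximum-weight spanning tree. Below is a minimal sketch of that edge-weight computation; the function name and toy data are our own.

```python
from collections import Counter
from math import log

def cond_mutual_info(xi, xj, c):
    """Empirical I(Xi; Xj | C): how much Xj tells us about Xi once the
    class C is known. TAN uses this as the weight of candidate edge Xi--Xj."""
    n = len(c)
    n_abk = Counter(zip(xi, xj, c))   # joint counts of (Xi, Xj, C)
    n_ak = Counter(zip(xi, c))        # counts of (Xi, C)
    n_bk = Counter(zip(xj, c))        # counts of (Xj, C)
    n_k = Counter(c)                  # class counts
    mi = 0.0
    for (a, b, k), cnt in n_abk.items():
        # p(a,b,k) * log[ p(a,b|k) / (p(a|k) p(b|k)) ], all from counts
        mi += (cnt / n) * log(cnt * n_k[k] / (n_ak[(a, k)] * n_bk[(b, k)]))
    return mi

# Toy data: Xi and Xj move together even within each class,
# so the edge Xi--Xj gets a high weight.
xi = [0, 0, 1, 1, 0, 1]
xj = [0, 0, 1, 1, 0, 1]
c  = [0, 1, 0, 1, 0, 1]
print(cond_mutual_info(xi, xj, c))  # ~0.64 nats
```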

Twelve classifiers from the mentioned categories were tested on the Ljubljana breast cancer dataset, which consists of nine predictor variables and 286 instances. WEKA software was used, and predictive accuracy was estimated with ten-fold stratified cross-validation. TAN was the best-performing model on average (about 77 percent accuracy); NB and selective NB were the worst (about 71 percent accuracy). Table 4 (p. 5:34) is a good compilation of the references for the different classifiers, presented in hierarchical order.
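
For readers who want to reproduce this kind of evaluation outside WEKA, here is an analogous setup sketched in Python with scikit-learn. CategoricalNB stands in for a discrete NB classifier, and the data below are random placeholders with the same shape as the Ljubljana dataset (nine discrete predictors, 286 instances, binary class), so the printed accuracy is meaningless; substitute the real data to replicate the comparison.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.integers(0, 4, size=(286, 9))   # placeholder: 286 instances, 9 discrete predictors
y = rng.integers(0, 2, size=286)        # placeholder: binary class labels

# Ten-fold stratified cross-validation, as in the survey's experiment.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(CategoricalNB(), X, y, cv=cv)
print(f"mean accuracy: {scores.mean():.3f}")
```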

Reviewer: Pragyansmita Nayak
Review #: CR144792 (1612-0929)
