
Bayesian network classifiers. (English) Zbl 0892.68077

Summary: Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call tree augmented naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.
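The "no search involved" claim rests on the fact that TAN's structure step reduces to the Chow-Liu spanning-tree construction. Below is a minimal sketch of that step, assuming discrete, integer-coded features held in numpy arrays; the helper names cond_mutual_info and tan_structure are illustrative, not from the paper. The tree is the maximum-weight spanning tree under conditional mutual information given the class; the full classifier would additionally add the class node as a parent of every feature and estimate smoothed conditional probability tables.

    import itertools
    from collections import defaultdict

    import numpy as np

    def cond_mutual_info(xi, xj, c):
        # Empirical conditional mutual information I(Xi; Xj | C),
        # estimated from co-occurrence frequencies.
        n = len(c)
        joint = defaultdict(float)
        for a, b, k in zip(xi, xj, c):
            joint[(a, b, k)] += 1.0 / n
        p_ic, p_jc, p_c = defaultdict(float), defaultdict(float), defaultdict(float)
        for (a, b, k), p in joint.items():
            p_ic[(a, k)] += p
            p_jc[(b, k)] += p
            p_c[k] += p
        return sum(p * np.log(p * p_c[k] / (p_ic[(a, k)] * p_jc[(b, k)]))
                   for (a, b, k), p in joint.items())

    def tan_structure(X, y):
        # Chow-Liu step of TAN: a maximum-weight spanning tree over the
        # features, weighted by I(Xi; Xj | C), directed away from feature 0.
        # Returns parent[i] = feature parent of Xi (None for the root); in
        # the full classifier every feature also has the class as a parent.
        d = X.shape[1]
        w = np.zeros((d, d))
        for i, j in itertools.combinations(range(d), 2):
            w[i, j] = w[j, i] = cond_mutual_info(X[:, i], X[:, j], y)
        in_tree, parent = {0}, {0: None}
        while len(in_tree) < d:  # Prim's algorithm, maximizing edge weight
            i, j = max(((i, j) for i in in_tree
                        for j in range(d) if j not in in_tree),
                       key=lambda e: w[e])
            in_tree.add(j)
            parent[j] = i
        return parent

    # Toy usage: x1 is correlated with x0 given the class, x2 is pure noise,
    # so the recovered tree should attach x1 to x0.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 500)
    x0 = (y + rng.integers(0, 2, 500)) % 2
    x1 = np.where(rng.random(500) < 0.9, x0, 1 - x0)
    x2 = rng.integers(0, 2, 500)
    print(tan_structure(np.column_stack([x0, x1, x2]), y))

Because structure learning here is a single spanning-tree computation over pairwise statistics, no heuristic search through the space of network structures is needed; this is the computational simplicity the summary attributes to TAN relative to learning unrestricted Bayesian networks.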

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

C4.5
