We consider algorithms for generating mixtures of bagged Markov trees for density estimation. In problems defined over many variables and with few available observations, such mixtures generally outperform a single Markov tree maximizing the data likelihood, but they are far more expensive to compute. In this paper, we describe new algorithms for approximating such models, with the aim of speeding up learning without sacrificing accuracy. More specifically, we propose to exploit a filtering step, obtained as a by-product of computing a first Markov tree, to avoid considering poor candidate edges in the subsequently generated trees. We compare these algorithms (on synthetic data sets) to mixtures of bagged Markov trees, as well as to a single Markov tree derived by the classical Chow-Liu algorithm and to a recently proposed randomized scheme for building tree mixtures.
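To make the filtering idea concrete, below is a minimal Python sketch, not the authors' implementation: it assumes binary variables, and the names mutual_information, chow_liu_edges, filtered_bagged_trees, n_trees and keep_fraction are illustrative. A first full Chow-Liu pass yields all pairwise mutual informations as a by-product; subsequent bagged trees are then restricted to the strongest candidate edges.

    import numpy as np
    from scipy.sparse.csgraph import minimum_spanning_tree

    def mutual_information(x, y):
        # Empirical mutual information between two binary (0/1) columns.
        joint = np.zeros((2, 2))
        for a, b in zip(x, y):
            joint[a, b] += 1
        joint /= len(x)
        px, py = joint.sum(axis=1), joint.sum(axis=0)
        return sum(joint[a, b] * np.log(joint[a, b] / (px[a] * py[b]))
                   for a in (0, 1) for b in (0, 1) if joint[a, b] > 0)

    def chow_liu_edges(data, candidates):
        # Maximum-weight spanning tree restricted to the candidate edges.
        # Zero-MI candidates drop out, so the result may be a forest,
        # which is harmless for density estimation.
        p = data.shape[1]
        w = np.zeros((p, p))
        for i, j in candidates:
            w[i, j] = w[j, i] = mutual_information(data[:, i], data[:, j])
        mst = minimum_spanning_tree(-w)  # negate weights: scipy minimizes
        return list(zip(*mst.nonzero())), w

    def filtered_bagged_trees(data, n_trees=10, keep_fraction=0.1):
        n, p = data.shape
        all_pairs = [(i, j) for i in range(p) for j in range(i + 1, p)]
        # First tree: full quadratic pass; the MI matrix is the by-product.
        tree0, w = chow_liu_edges(data, all_pairs)
        # Filtering step: keep only the strongest candidate edges for the
        # subsequently generated trees (but never fewer than p - 1 edges).
        k = max(p - 1, int(keep_fraction * len(all_pairs)))
        kept = sorted(all_pairs, key=lambda e: -w[e])[:k]
        rng = np.random.default_rng(0)
        trees = [tree0]
        for _ in range(n_trees - 1):
            boot = data[rng.integers(0, n, size=n)]  # bootstrap replicate
            trees.append(chow_liu_edges(boot, kept)[0])
        return trees

Each bootstrap tree is thus built over only k candidate edges instead of all p(p-1)/2 pairs, which is where the speed-up over plain bagging comes from.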
Research Center/Unit :
Systèmes et Modélisation
Disciplines :
Computer science
Author, co-author :
Schnitzler, François ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Ammar, Sourour
Leray, Philippe
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Efficiently approximating Markov tree bagging for high-dimensional density estimation
Publication date :
September 2011
Event name :
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
Event organizer :
Prof. A. Likas, Prof. Y. Theodoridis
Event place :
Athens, Greece
Event date :
from 05-09-2011 to 09-09-2011
Audience :
International
Main work title :
Machine Learning and Knowledge Discovery in Databases, Part III
Funders :
FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture
Biomagnet IUAP network of the Belgian Science Policy Office
Pascal2 network of excellence of the EC
References :
Aliferis, C., Statnikov, A., Tsamardinos, I., Mani, S., Koutsoukos, X.: Local causal and Markov blanket induction for causal discovery and feature selection for classification, Part I: Algorithms and empirical evaluation. JMLR 11, 171-234 (2010)
Ammar, S., Leray, P., Defourny, B., Wehenkel, L.: Probability density estimation by perturbing and combining tree structured Markov networks. In: Sossai, C., Chemello, G. (eds.) ECSQARU 2009. LNCS, vol. 5590, pp. 156-167. Springer, Heidelberg (2009)
Ammar, S., Leray, P., Schnitzler, F., Wehenkel, L.: Sub-quadratic Markov tree mixture learning based on randomizations of the Chow-Liu algorithm. In: The Fifth European Workshop on Probabilistic Graphical Models, pp. 17-24 (2010)
Ammar, S., Leray, P., Wehenkel, L.: Sub-quadratic Markov tree mixture models for probability density estimation. In: 19th International Conference on Computational Statistics (COMPSTAT 2010), pp. 673-680 (2010)
Auvray, V., Wehenkel, L.: On the construction of the inclusion boundary neighbourhood for Markov equivalence classes of Bayesian network structures. In: Proceedings of 18th Conference on Uncertainty in Artificial Intelligence, pp. 26-35 (2002)
Auvray, V., Wehenkel, L.: Learning inclusion-optimal chordal graphs. In: Proceedings of 24th Conference on Uncertainty in Artificial Intelligence, pp. 18-25 (2008)
Bach, F.R., Jordan, M.I.: Thin junction trees. In: Advances in Neural Information Processing Systems, vol. 14, pp. 569-576. MIT Press, Cambridge (2001)
Breiman, L.: Arcing classifiers. Tech. rep., Dept. of Statistics, University of California (1996)
Chazelle, B.: A minimum spanning tree algorithm with inverse-Ackermann type complexity. J. ACM 47(6), 1028-1047 (2000)
Chow, C., Liu, C.: Approximating discrete probability distributions with dependence trees. IEEE Trans. Inf. Theory 14, 462-467 (1968)
Cooper, G.: The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence 42(2-3), 393-405 (1990)
Efron, B., Tibshirani, R.: An introduction to the bootstrap. Chapman & Hall, Boca Raton (1993)
Elidan, G.: Bagged structure learning of Bayesian network. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (2011)
Friedman, N., Goldszmidt, M., Wyner, A.: Data analysis with Bayesian networks: A bootstrap approach. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pp. 196-205 (1999)
Friedman, N., Nachman, I., Pe'er, D.: Learning Bayesian network structure from massive datasets: The "sparse candidate" algorithm. In: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, pp. 206-215 (1999)
Kirshner, S., Smyth, P.: Infinite mixtures of trees. In: ICML 2007: Proceedings of the 24th International Conference on Machine Learning, pp. 417-423. ACM, New York (2007)
Koller, D., Friedman, N.: Probabilistic Graphical Models. MIT Press, Cambridge (2009)
Kruskal, J.B.: On the shortest spanning subtree of a graph and the traveling salesman problem. Proceedings of the American Mathematical Society 7(1), 48-50 (1956)
Kullback, S., Leibler, R.: On information and sufficiency. Ann. Math. Stat. 22(1), 79-86 (1951)
Kumar, M.P., Koller, D.: Learning a small mixture of trees. In: Advances in Neural Information Processing Systems, vol. 22, pp. 1051-1059 (2009)
Kwisthout, J.H., Bodlaender, H.L., van der Gaag, L.: The necessity of bounded treewidth for efficient inference in Bayesian networks. In: Proceedings of the 19th European Conference on Artificial Intelligence, pp. 623-626 (2010)
Liu, H., Xu, M., Gu, H., Gupta, A., Lafferty, J., Wasserman, L.: Forest density estimation. JMLR 12, 907-951 (2011)
Madigan, D., Raftery, A., Wermuth, N., York, J., Zucchini, W.: Model selection and accounting for model uncertainty in graphical models using Occam's window. Journal of the American Statistical Association 89, 1535-1546 (1994)
Meila, M., Jordan, M.: Learning with mixtures of trees. JMLR 1, 1-48 (2001)
Meila, M., Jaakkola, T.: Tractable Bayesian learning of tree belief networks. In: Proc. of the Sixteenth Conference on Uncertainty in Artificial Intelligence, pp. 380-388. Morgan Kaufmann, San Francisco (2000)
Ormoneit, D., Tresp, V.: Improved Gaussian mixture density estimates using Bayesian penalty terms and network averaging. In: Advances in Neural Information Processing Systems, pp. 542-548. MIT Press, Cambridge (1995)
Pearl, J.: Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, San Francisco (1988)
Schnitzler, F., Leray, P., Wehenkel, L.: Towards sub-quadratic learning of probability density models in the form of mixtures of trees. In: 18th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2010), Bruges, Belgium, pp. 219-224 (2010)
Shahaf, D., Chechetka, A., Guestrin, C.: Learning thin junction trees via graph cuts. In: Artificial Intelligence and Statistics (AISTATS), pp. 113-120 (2009)
Tarjan, R.E.: Data structures and network algorithms. Society for Industrial and Applied Mathematics, Philadelphia (1983)
Wehenkel, L.: Decision tree pruning using an additive information quality measure. In: Uncertainty in Intelligent Systems, pp. 397-411 (1993)