Bayesian networks; Markov trees; Perturb and Combine; density estimation; Chow-Liu algorithm; Mixture of trees; EM algorithm; Variance reduction
Abstract :
[en] Markov trees, a probabilistic graphical model for density estimation, can be combined in the form of a weighted average of Markov trees. Mixtures or ensembles of trees can be learned from observations to reduce either the bias or the variance of the estimated model. We propose a new combination of both approaches, in which the upper level seeks to reduce bias while the lower level seeks to reduce variance. This algorithm is evaluated empirically on datasets generated from a mixture of Markov trees and from other synthetic densities.
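The lower, variance-reduction level described in the abstract can be illustrated with a minimal sketch: bagging Chow-Liu trees and averaging their densities uniformly. This is not the authors' implementation; it assumes binary variables, uses Laplace smoothing, shows only the bagged-ensemble level (uniform weights in place of the paper's EM-learned mixture weights), and all function names are placeholders.

```python
import math
import random
from itertools import combinations

def chow_liu_tree(data, n_vars, alpha=1.0):
    """One Markov tree via Chow-Liu: maximum-weight spanning tree over
    pairwise empirical mutual information (binary vars, Laplace smoothing)."""
    n = len(data)

    def marg(i):
        c = sum(row[i] for row in data)
        return [(n - c + alpha) / (n + 2 * alpha), (c + alpha) / (n + 2 * alpha)]

    def joint(i, j):
        cnt = [[alpha, alpha], [alpha, alpha]]
        for row in data:
            cnt[row[i]][row[j]] += 1
        tot = n + 4 * alpha
        return [[cnt[a][b] / tot for b in (0, 1)] for a in (0, 1)]

    # pairwise mutual information for every candidate edge
    edges = []
    for i, j in combinations(range(n_vars), 2):
        pi, pj, pij = marg(i), marg(j), joint(i, j)
        mi = sum(pij[a][b] * math.log(pij[a][b] / (pi[a] * pj[b]))
                 for a in (0, 1) for b in (0, 1))
        edges.append((mi, i, j))

    # maximum spanning tree (Kruskal with union-find)
    parent = list(range(n_vars))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    adj = {v: [] for v in range(n_vars)}
    for mi, i, j in sorted(edges, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            adj[i].append(j)
            adj[j].append(i)

    # orient the tree away from variable 0 and store conditional tables
    order, par, seen, stack = [0], {0: None}, {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                par[v] = u
                order.append(v)
                stack.append(v)
    probs = {0: marg(0)}  # root marginal
    for v in order[1:]:
        u = par[v]
        puv = joint(u, v)
        pu = [puv[a][0] + puv[a][1] for a in (0, 1)]  # marginal taken from the joint so conditionals normalize
        probs[v] = [[puv[a][b] / pu[a] for b in (0, 1)] for a in (0, 1)]
    return order, par, probs

def tree_prob(model, x):
    """Probability of configuration x under one Markov tree."""
    order, par, probs = model
    p = probs[order[0]][x[order[0]]]
    for v in order[1:]:
        p *= probs[v][x[par[v]]][x[v]]
    return p

def bagged_trees(data, n_vars, n_bags=10, seed=0):
    """Lower level: learn one Chow-Liu tree per bootstrap replicate."""
    rng = random.Random(seed)
    return [chow_liu_tree([rng.choice(data) for _ in range(len(data))], n_vars)
            for _ in range(n_bags)]

def ensemble_prob(models, x):
    """Uniform average of the bagged tree densities (variance reduction)."""
    return sum(tree_prob(m, x) for m in models) / len(models)
```

In the paper's setting, several such bagged ensembles would then be combined at the upper level with mixture weights fitted by EM; here the uniform average stands in for that step.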
Research Center/Unit :
Systems and Modelling
Disciplines :
Computer science
Author, co-author :
Schnitzler, François ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Mixtures of Bagged Markov Tree Ensembles
Publication date :
September 2012
Event name :
Sixth European Workshop on Probabilistic Graphical Models
Event place :
Granada, Spain
Event date :
from 19-09-2012 to 21-09-2012
Audience :
International
Main work title :
Proceedings of the 6th European Workshop on Probabilistic Graphical Models
Editor :
Cano Utrera, Andrés
Gómez-Olmedo, Manuel
Nielsen, Thomas
Pages :
283-290
Peer reviewed :
Peer reviewed
Funders :
FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture
References :
S. Ammar, P. Leray, F. Schnitzler, and L. Wehenkel. 2010. Sub-quadratic Markov tree mixture learning based on randomizations of the Chow-Liu algorithm. In 5th European Workshop on Probabilistic Graphical Models, pages 17-24.
V. Auvray and L. Wehenkel. 2008. Learning inclusion-optimal chordal graphs. In 24th UAI, pages 18-25.
F. R. Bach and M. I. Jordan. 2001. Thin junction trees. In Advances in Neural Information Processing Systems 14, pages 569-576. MIT Press.
L. Breiman. 1996. Arcing classifiers. Technical report, Dept. of Statistics, University of California.
L. Breiman. 1999. Using adaptive bagging to debias regressions. Technical report, Dept. of Statistics, University of California.
D.M. Chickering, D. Geiger, and D. Heckerman. 1994. Learning Bayesian networks is NP-hard. Technical report, Microsoft Research.
C.I. Chow and C.N. Liu. 1968. Approximating discrete probability distributions with dependence trees. IEEE Trans. Inf. Theory, 14:462-467.
G.F. Cooper. 1990. The computational complexity of probabilistic inference using Bayesian belief networks. Artificial Intelligence, 42(2-3):393-405.
A. P. Dempster, N. M. Laird, and D. B. Rubin. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39:1-38.
G. Elidan and S. Gould. 2008. Learning bounded treewidth Bayesian networks. JMLR, 9:2699-2731.
S. Kirshner and P. Smyth. 2007. Infinite mixtures of trees. In 24th International Conference on Machine Learning, pages 417-423.
D. Koller and N. Friedman. 2009. Probabilistic Graphical Models. MIT Press.
J. Kollin and M. Koivisto. 2006. Bayesian learning with mixtures of trees. ECML, pages 294-305.
J. Kwisthout, H. Bodlaender, and L. van der Gaag. 2010. The necessity of bounded treewidth for efficient inference in Bayesian networks. In 19th European Conference on Artificial Intelligence, pages 623-626.
H. Liu, M. Xu, H. Gu, A. Gupta, J. Lafferty, and L. Wasserman. 2011. Forest density estimation. JMLR, 12:907-951.
G.J. McLachlan and T. Krishnan. 2008. The EM Algorithm and Extensions, volume 382. John Wiley and Sons.
M. Meila and T. Jaakkola. 2006. Tractable Bayesian learning of tree belief networks. Statistics and Computing, 16:77-92.
M. Meila and M.I. Jordan. 2001. Learning with mixtures of trees. JMLR, 1:1-48.
J. Pearl. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann.
F. Schnitzler, Ph. Leray, and L. Wehenkel. 2010. Towards sub-quadratic learning of probability density models in the form of mixtures of trees. In 18th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, pages 219-224.
V.Y.F. Tan, A. Anandkumar, and A. S. Willsky. 2011. Learning high-dimensional Markov forest distributions: Analysis of error rates. JMLR, 12:1617-1653.