2010 • In Myllymäki, Petri; Roos, Antoine; Jaakkola, Tommi (Eds.) Proceedings of the Fifth European Workshop on Probabilistic Graphical Models (PGM-2010)
bayesian networks; Markov Trees; Chow-Liu algorithm; mixture of trees
Abstract :
[en] The present work analyzes different randomized methods to learn Markov tree mixtures
for density estimation in very high-dimensional discrete spaces (very large number n of
discrete variables) when the sample size (N ) is very small compared to n. Several sub-
quadratic relaxations of the Chow-Liu algorithm are proposed, weakening its search proce-
dure. We first study na¨ıve randomizations and then gradually increase the deterministic
behavior of the algorithms by trying to focus on the most interesting edges, either by
retaining the best edges between models, or by inferring promising relationships between
variables. We compare these methods to totally random tree generation and randomiza-
tion based on bootstrap-resampling (bagging), of respectively linear and quadratic com-
plexity. Our results show that randomization becomes increasingly more interesting for
smaller N/n ratios, and that methods based on simultaneously discovering and exploiting
the problem structure are promising in this context.
Research Center/Unit :
Systèmes et Modélisation
Disciplines :
Computer science
Author, co-author :
Ammar, Sourour; Ecole Polytechnique de l’Université de Nantes > Laboratoire d’Informatique de Nantes Atlantique > Knowledge and Decision Team
Leray, Philippe; Ecole Polytechnique de l’Université de Nantes > Laboratoire d’Informatique de Nantes Atlantique > Knowledge and Decision Team
Schnitzler, François ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Sub-quadratic Markov tree mixture learning based on randomizations of the Chow-Liu algorithm
Publication date :
September 2010
Event name :
The Fifth European Workshop on Probabilistic Graphical Models
Event place :
Helsinki, Finland
Event date :
from 13-09-2010 to 15-09-2010
Audience :
International
Main work title :
Proceedings of the Fifth European Workshop on Probabilistic Graphical Models (PGM-2010)
Editor :
Myllymäki, Petri
Roos, Antoine ; Université de Liège - ULiège > Centres généraux > Centre interdisciplinaire de formation de formateurs de l'ULg (CIFFUL)
Jaakkola, Tommi
ISBN/EAN :
978-952-60-3314-3
Pages :
17-24
Peer reviewed :
Peer reviewed
Funders :
FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture Biomagnet IUAP network of the Belgian Science Policy Office Pascal2 network of excellence of the EC
S. Ammar, Ph. Leray, B. Defourny, and L. Wehenkel. 2008. High-dimensional probability density estimation with randomized ensembles of tree structured Bayesian networks. In Proceedings of the fourth European Workshop on Probabilistic Graphical Models (PGM'08), pages 9-16, Hirtshals, Denmark.
S. Ammar, Ph. Leray, B. Defourny, and L. Wehenkel. 2009. Probability density estimation by perturbing and combining tree structured Markov networks. In Proceedings of the 10th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty (ECSQARU 2009), pages 156-167, Verona, Italy.
S. Ammar, Ph. Leray, and L. Wehenkel. 2010. Subquadratic Markov tree mixture models for probability density estimation. In 19th International Conference on Computational Statistics (COMPSTAT 2010), pages 673-680, Paris, France.
V. Auvray and L. Wehenkel. 2002. On the construction of the inclusion boundary neighbourhood for Markov equivalence classes of Bayesian network structures. In Adnan Darwiche and Nir Friedman, editors, Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence (UAI-02), pages 26-35, S.F., Cal. Morgan Kaufmann Publishers.
L. Breiman. 1996. Arcing classifiers. Technical report, Dept. of Statistics, University of California.
C.K. Chow and C. N. Liu. 1968. Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14(3):462-467.
G.F. Cooper. 1990. The computational complexity of probabilistic inference using bayesian belief networks. Artificial Intelligence, 42(2-3):393-405, March.
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein. 2001. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill.
S. Kullback and R. Leibler. 1951. On information and sufficiency. Annals of Mathematical Statistics, 22(1):79-86.
M. Meila and M. I. Jordan. 2000. Learning with mixtures of trees. Journal of Machine Learning Research, 1:1-48.
J. Pearl. 1986. Fusion, propagation, and structuring in belief networks. Artificial Intelligence, 29:241- 288.
F. Schnitzler, Ph. Leray, and L.Wehenkel. 2010. Towards sub-quadratic learning of probability density models in the form of mixtures of trees. In 18th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2010), pages 219-224, Bruges, Belgium.