Master’s dissertation (Dissertations and theses)
Improvement of randomized ensembles of trees for supervised learning in very high dimension
Joly, Arnaud
2011
 

Files


Full Text
JOLY_Arnaud-master_thesis.pdf
Publisher postprint (1.62 MB)
Annexes
résumé AIM.doc
Publisher postprint (40.45 kB)
slides.pdf
Publisher postprint (494.18 kB)
texte_slide.txt
Publisher postprint (7.98 kB)




Details



Keywords :
Machine learning; Supervised learning; Ensemble of randomized trees; Pruning; L1-norm Regularisation; LASSO; Sparse model; Randomisation
Abstract :
[en] Tree-based ensemble methods, such as random forests and extremely randomized trees, are methods of choice for handling high-dimensional problems. One important drawback of these methods, however, is the complexity of the models they produce (i.e. the large number and size of the trees) in order to achieve good performance. In this work, several research directions are identified to address this problem, and one of them is developed in detail. From a tree ensemble, one can extract a set of binary features, each associated with a leaf or a node of a tree and true for a given object only if that object reaches the corresponding leaf or node when propagated through the tree. Given this representation, the prediction of the ensemble can be recovered simply by linearly combining these characteristic features with appropriate weights. We apply a linear feature selection method, namely the monotone LASSO, to these features in order to simplify the tree ensemble. A subtree is then pruned as soon as the characteristic features corresponding to its constituent nodes are not selected in the linear model. Empirical experiments show that combining the monotone LASSO with features extracted from tree ensembles drastically reduces the number of features and can at the same time improve accuracy with respect to unpruned ensembles of trees.
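The node-indicator representation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the thesis: the `Node` class, the two toy trees, and the helper names (`reached_nodes`, `indicator_features`, `ensemble_predict`, `leaf_weights`) are hypothetical, and the ensemble is assumed to predict by averaging the leaf values of its trees. Under that assumption, giving each leaf feature the weight "leaf value divided by number of trees" and each internal-node feature the weight zero reproduces the ensemble prediction as a linear combination. The monotone LASSO selection step itself is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; node_id indexes its binary indicator feature."""
    node_id: int
    value: float = 0.0                 # prediction stored at a leaf
    feature: Optional[int] = None      # split variable; None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None      # taken when x[feature] <= threshold
    right: Optional["Node"] = None


def reached_nodes(tree: Node, x: List[float]) -> List[Node]:
    """Nodes visited when x is propagated from the root down to a leaf."""
    visited = [tree]
    node = tree
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
        visited.append(node)
    return visited


def indicator_features(trees: List[Node], x: List[float], n_nodes: int) -> List[int]:
    """z[i] = 1 if x reaches node i in its tree, 0 otherwise."""
    z = [0] * n_nodes
    for tree in trees:
        for node in reached_nodes(tree, x):
            z[node.node_id] = 1
    return z


def ensemble_predict(trees: List[Node], x: List[float]) -> float:
    """Assumed ensemble rule: average of the leaf values reached in each tree."""
    return sum(reached_nodes(t, x)[-1].value for t in trees) / len(trees)


def leaf_weights(trees: List[Node], n_nodes: int) -> List[float]:
    """Weights that reproduce the averaged prediction from the indicators."""
    w = [0.0] * n_nodes

    def visit(node: Node) -> None:
        if node.feature is None:
            w[node.node_id] = node.value / len(trees)
        else:
            visit(node.left)
            visit(node.right)

    for tree in trees:
        visit(tree)
    return w


# Two toy trees with globally unique node ids 0..7 (illustrative values only).
TREE_A = Node(0, feature=0, threshold=0.5,
              left=Node(1, value=1.0), right=Node(2, value=3.0))
TREE_B = Node(3, feature=1, threshold=0.0,
              left=Node(4, value=2.0),
              right=Node(5, feature=0, threshold=2.0,
                         left=Node(6, value=4.0), right=Node(7, value=8.0)))
TREES = [TREE_A, TREE_B]
N_NODES = 8

if __name__ == "__main__":
    x = [1.0, 1.0]
    z = indicator_features(TREES, x, N_NODES)
    w = leaf_weights(TREES, N_NODES)
    linear = sum(wi * zi for wi, zi in zip(w, z))
    print(z, linear, ensemble_predict(TREES, x))
```

In this representation, a sparse linear method would be fitted on the vectors `z` in place of the fixed weights `w`; nodes whose features receive no weight are the candidates for pruning.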
Research Center/Unit :
Systems and Modeling research unit
Disciplines :
Electrical & electronics engineering
Computer science
Author, co-author :
Joly, Arnaud ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Improvement of randomized ensembles of trees for supervised learning in very high dimension
Alternative titles :
[fr] Amélioration des ensemble d'arbres aléatoire pour de l'apprentissage supervisé en très haute dimension
Defense date :
June 2011
Institution :
ULiège - Université de Liège
Degree :
Master en ingénieur civil électricien, à finalité approfondie
Promotor :
Wehenkel, Louis  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Geurts, Pierre  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
President :
Destiné, Jacques ;  Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore)
Jury member :
Louveaux, Quentin  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Van Steen, Kristel  ;  Université de Liège - ULiège > GIGA > GIGA Medical Genomics - Biostatistics, biomedicine and bioinformatics
Available on ORBi :
since 30 November 2011

