L1-based compression of random forest models

Joly, Arnaud; Schnitzler, François; Geurts, Pierre; Wehenkel, Louis

Paper published in a book (Scientific congresses and symposiums)

2012 • In 20th European Symposium on Artificial Neural Networks

Peer reviewed

Permalink
https://hdl.handle.net/2268/112824

Files

Full Text

es2012-43.pdf

Publisher postprint (248.59 kB)

Details

Keywords :

Ensemble of randomized trees; Pruning; L1-norm regularization; LASSO; Supervised learning; Machine Learning; Randomization; Model reduction; Decision tree

Abstract :

Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, measured by their total number of nodes, is often prohibitive, especially for problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that it is indeed possible to preserve or even improve the model accuracy while significantly reducing its space complexity.
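The idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation or experimental setup: it assumes a toy one-dimensional regression problem, totally randomized trees whose nodes are intervals, and a plain coordinate-descent Lasso solver. All function names (`grow_nodes`, `lasso_on_nodes`, `predict`, `run_demo`) and parameter values are hypothetical and chosen only for illustration. Each node contributes one 0/1 indicator function, the L1 penalty drives most node weights to exactly zero, and the nodes with non-zero weight form the compressed model.

```python
import math
import random

def grow_nodes(lo, hi, depth, rng, nodes):
    """Append the interval [lo, hi) of every node of one random 1-D tree."""
    nodes.append((lo, hi))
    if depth > 0:
        t = rng.uniform(lo, hi)  # split threshold drawn at random
        grow_nodes(lo, t, depth - 1, rng, nodes)
        grow_nodes(t, hi, depth - 1, rng, nodes)

def soft_threshold(v, lam):
    """Shrink v toward zero by lam; values within [-lam, lam] become 0."""
    return max(v - lam, 0.0) - max(-v - lam, 0.0)

def lasso_on_nodes(xs, ys, nodes, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)*||y - Z w||^2 + lam*||w||_1,
    where column j of Z is the 0/1 indicator of node j."""
    n = len(xs)
    # Sparse columns: the indices of the samples that fall in each node.
    members = [[i for i, x in enumerate(xs) if lo <= x < hi] for lo, hi in nodes]
    w = [0.0] * len(nodes)
    resid = list(ys)  # residual y - Z w, with w = 0 initially
    for _ in range(n_sweeps):
        for j, idx in enumerate(members):
            if not idx:
                continue  # empty node: its weight stays at zero
            scale = len(idx) / n  # (1/n) * squared norm of a 0/1 column
            rho = sum(resid[i] for i in idx) / n + w[j] * scale
            new_w = soft_threshold(rho, lam) / scale
            delta = new_w - w[j]
            if delta:
                for i in idx:
                    resid[i] -= delta  # keep the residual in sync with w
                w[j] = new_w
    return w

def predict(x, nodes, w):
    """Sum the weights of the retained nodes whose interval contains x."""
    return sum(wj for (lo, hi), wj in zip(nodes, w) if wj and lo <= x < hi)

def run_demo(seed=0, n_samples=200, n_trees=10, depth=4, lam=0.02):
    """Toy data and settings; every value here is an illustrative assumption."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_samples)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0, 0.1) for x in xs]
    nodes = []
    for _ in range(n_trees):
        grow_nodes(0.0, 1.0, depth, rng, nodes)
    w = lasso_on_nodes(xs, ys, nodes, lam)
    kept = sum(1 for wj in w if wj != 0.0)
    mse = sum((y - predict(x, nodes, w)) ** 2 for x, y in zip(xs, ys)) / n_samples
    baseline = sum(y * y for y in ys) / n_samples  # error of the all-zero model
    return len(nodes), kept, mse, baseline

if __name__ == "__main__":
    total, kept, mse, baseline = run_demo()
    print(f"nodes kept: {kept} of {total}")
    print(f"training MSE: {mse:.4f} (all-zero model: {baseline:.4f})")
```

Raising `lam` in this sketch keeps fewer nodes at the cost of a looser fit, which is the accuracy versus space trade-off the abstract refers to. A practical version would presumably use a real tree-ensemble learner and an optimized Lasso solver, and would choose the penalty on held-out data.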

Research Center/Unit :

Système et modélisation
GIGA‐R - Giga‐Research - ULiège

Disciplines :

Electrical & electronics engineering
Computer science

Author, co-author :

Joly, Arnaud ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Schnitzler, François ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Language :

English

Title :

L1-based compression of random forest models

Publication date :

April 2012

Event name :

European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning

Event organizer :

Michel Verleysen

Event place :

Bruges, Belgium

Event date :

25 - 27 April 2012

Audience :

International

Main work title :

20th European Symposium on Artificial Neural Networks

Peer review/Selection committee :

Peer reviewed

Funders :

FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture
Biomagnet IUAP network of the Belgian Science Policy Office
Pascal2 network of excellence of the EC

Available on ORBi :

since 25 February 2012

Statistics

Number of views

814 (82 by ULiège)

Number of downloads

520 (33 by ULiège)
