Keywords :
Compression; Prepruning; Random Forest; Extremely randomized trees; Iterative model; Stagewise
Abstract :
[en] Tree-based ensemble models are heavy memory-wise, an undesirable state of affairs given today's dataset sizes, memory-constrained environments, and fitting/prediction times. In this paper, we propose the Globally Induced Forest (GIF) to remedy this problem. GIF is a fast prepruning approach that builds lightweight ensembles by iteratively deepening the current forest. It mixes local and global optimization to produce accurate predictions under memory constraints in reasonable time. We show that the proposed method is more than competitive with standard tree-based ensembles under corresponding constraints, and can sometimes even surpass much larger models.
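As a rough illustration of the deepening loop described in the abstract, here is a minimal regression-only Python sketch, assuming squared loss and extremely-randomized-trees-style random splits. It is a conceptual toy, not the authors' implementation; the names gif_sketch, node_budget, and learning_rate are illustrative only.

    import numpy as np

    def gif_sketch(X, y, n_trees=3, node_budget=50, learning_rate=0.5, seed=0):
        """Toy sketch of globally induced deepening (regression, squared loss).

        A pool of candidate nodes is shared across all trees; at each step
        the node whose globally fitted constant contribution most reduces
        the ensemble's squared error is developed and split.
        """
        rng = np.random.default_rng(seed)
        n = len(y)
        y_hat = np.zeros(n)  # current ensemble prediction on the training set
        # each candidate node is just the index set of the samples it covers;
        # every tree starts as a root node covering the whole sample
        candidates = [np.arange(n) for _ in range(n_trees)]
        for _ in range(node_budget):
            if not candidates:
                break
            residual = y - y_hat
            # globally optimal constant weight per candidate (squared loss)
            weights = [residual[idx].mean() for idx in candidates]
            # SSE reduction of fitting full weight w on |idx| samples is
            # |idx| * w**2; shrinkage rescales all gains equally, so the
            # argmax (node selection) is unchanged
            gains = [len(idx) * w * w for idx, w in zip(candidates, weights)]
            best = int(np.argmax(gains))
            idx, w = candidates.pop(best), weights[best]
            y_hat[idx] += learning_rate * w  # commit the shrunken contribution
            # split the developed node on a random feature and threshold
            # (extremely-randomized-trees style) and enqueue its children
            feat = rng.integers(X.shape[1])
            values = X[idx, feat]
            if values.min() == values.max():
                continue  # node is constant in this feature; cannot split
            thr = rng.uniform(values.min(), values.max())
            left, right = idx[values <= thr], idx[values > thr]
            if len(left) and len(right):
                candidates += [left, right]
        return y_hat

At each step, the candidate whose globally fitted constant weight yields the largest squared-error reduction is developed; shrinking that weight by the learning rate only rescales the committed contribution, not the node ranking. A real implementation would also store each developed node's (feature, threshold, weight) to predict on unseen data; the sketch tracks training-set predictions only, to keep the idea visible.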
Disciplines :
Computer science
Author, co-author :
Begon, Jean-Michel ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Joly, Arnaud ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Language :
English
Title :
Globally Induced Forest: A Prepruning Compression Scheme
Alternative titles :
[fr] Globally Induced Forest: une méthode d'élagage
Publication date :
2017
Event name :
34th International Conference on Machine Learning
Event place :
Sydney, Australia
Event date :
August 7 to August 11, 2017
Audience :
International
Journal title :
Proceedings of Machine Learning Research
eISSN :
2640-3498
Publisher :
Microtome Publishing, Brookline, Massachusetts, United States
Special issue title :
Proceedings of the 34th International Conference on Machine Learning
Volume :
70
Pages :
420-428
Peer reviewed :
Peer Reviewed verified by ORBi
Tags :
CÉCI : Consortium des Équipements de Calcul Intensif