Closed-form dual perturb and combine for tree-based models

Geurts, Pierre; Wehenkel, Louis

doi:10.1145/1102351.1102381

Paper published in a book (Scientific congresses and symposiums)

2005 • In Proceedings of the International Conference on Machine Learning (ICML 2005)

Peer reviewed

Permalink
https://hdl.handle.net/2268/25764

DOI
10.1145/1102351.1102381

Files

Full Text

geurts-icml-2005.pdf

Publisher postprint (218 kB)

Details

Keywords :

machine learning; optimisation

Abstract :

This paper studies the aggregation of predictions made by tree-based models for several perturbed versions of the attribute vector of a test case. It proposes a closed-form approximation of this scheme, combined with cross-validation to tune the level of perturbation. This yields soft-tree models in a parameter-free way and preserves their interpretability. Empirical evaluations on classification and regression problems show that accuracy and the bias/variance tradeoff are improved significantly at the price of an acceptable computational overhead. The method is further compared and combined with tree bagging.
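
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes independent Gaussian noise of standard deviation `sigma` on each attribute, a hypothetical tuple-based tree encoding, and numeric leaf values (for classification, a leaf could hold a class probability instead). All function names are illustrative. `predict_monte_carlo` averages hard-tree predictions over perturbed copies of the test case, while `predict_soft` replaces the sampling by a closed-form rule in which each test sends the case left with probability Φ((threshold − x) / σ) and successive tests are treated as independent.

```python
import math
import random

# Hypothetical tree encoding (an assumption, not taken from the paper):
#   ("leaf", value)  or  ("split", feature_index, threshold, left, right)
# A case goes to the left child when x[feature_index] < threshold.


def predict_hard(tree, x):
    """Ordinary tree prediction: follow one path from root to leaf."""
    while tree[0] == "split":
        _, feature, threshold, left, right = tree
        tree = left if x[feature] < threshold else right
    return tree[1]


def predict_monte_carlo(tree, x, sigma, n_samples=1000, seed=0):
    """Average hard predictions over Gaussian-perturbed copies of x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        total += predict_hard(tree, noisy)
    return total / n_samples


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def predict_soft(tree, x, sigma):
    """Closed-form soft prediction, treating successive tests as independent."""
    if sigma <= 0.0:
        # No perturbation: the soft tree reduces to the hard tree.
        return predict_hard(tree, x)
    if tree[0] == "leaf":
        return tree[1]
    _, feature, threshold, left, right = tree
    # Probability that the perturbed attribute falls below the threshold.
    p_left = normal_cdf((threshold - x[feature]) / sigma)
    return (p_left * predict_soft(left, x, sigma)
            + (1.0 - p_left) * predict_soft(right, x, sigma))


# Toy two-attribute tree and a test case lying exactly on both thresholds.
tree = ("split", 0, 0.5,
        ("leaf", 0.0),
        ("split", 1, 0.0, ("leaf", 1.0), ("leaf", 2.0)))
x = [0.5, 0.0]
```

On this toy tree each attribute is tested at most once per path, so the independence assumption holds exactly and `predict_soft(tree, x, 1.0)` returns 0.75, which the Monte Carlo average approaches as `n_samples` grows. When the same attribute is tested several times along a path, the product of per-test probabilities is only an approximation. The perturbation level `sigma` is the quantity the abstract describes as tuned by cross-validation; that tuning step is not shown here.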

Disciplines :

Computer science

Author, co-author :

Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Language :

English

Title :

Closed-form dual perturb and combine for tree-based models

Publication date :

2005

Event name :

22nd International Conference on Machine Learning

Event place :

Bonn, Germany

Event date :

2005

Audience :

International

Main work title :

Proceedings of the International Conference on Machine Learning (ICML 2005)

Peer review/Selection committee :

Peer reviewed

Additional URL :

http://www.montefiore.ulg.ac.be/services/stochastic/pubs/2005/GW05

Available on ORBi :

since 16 October 2009
