Investigation and reduction of discretization variance in decision tree induction

Geurts, Pierre; Wehenkel, Louis

Paper published in a book (Scientific congresses and symposiums)

2000 • In Proceedings of ECML 2000, European Conference on Machine Learning

Peer reviewed

Permalink
https://hdl.handle.net/2268/25762

Files

Full Text

geurts-ecml2000.pdf

Publisher postprint (246.99 kB)

Details

Keywords :

machine learning

Abstract :

This paper focuses on the variance introduced by the discretization techniques used to handle continuous attributes in decision tree induction. Different discretization procedures are first studied empirically, then means to reduce the discretization variance are proposed. The experiments show that discretization variance is large and that it is possible to reduce it significantly without notable computational costs. The resulting variance reduction mainly improves interpretability and stability of decision trees, and marginally their accuracy.
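
To make the notion of discretization variance concrete, the following Python sketch measures how much the cut point chosen for one continuous attribute moves from one learning sample to the next. It is only an illustration under stated assumptions, not the procedure evaluated in the paper: the entropy-based split criterion, the synthetic data generator `draw_sample`, and all function names are hypothetical choices made for this example.

```python
import math
import random
import statistics


def entropy(pos, total):
    """Binary class entropy (bits) of a subset with `pos` positives out of `total`."""
    if total == 0 or pos == 0 or pos == total:
        return 0.0
    p = pos / total
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def best_threshold(sample):
    """Return the cut point that minimizes the weighted class entropy.

    `sample` is a list of (value, label) pairs with label in {0, 1}.
    Candidate cut points are midpoints between consecutive distinct values.
    """
    data = sorted(sample)
    n = len(data)
    total_pos = sum(label for _, label in data)
    best_score, best_cut = float("inf"), None
    left_pos = 0
    for i in range(1, n):
        left_pos += data[i - 1][1]
        if data[i][0] == data[i - 1][0]:
            continue  # no cut point between equal attribute values
        score = (i * entropy(left_pos, i)
                 + (n - i) * entropy(total_pos - left_pos, n - i)) / n
        if score < best_score:
            best_score = score
            best_cut = (data[i - 1][0] + data[i][0]) / 2
    return best_cut


def draw_sample(rng, n):
    """Hypothetical noisy problem: class 1 becomes more likely as x grows."""
    sample = []
    for _ in range(n):
        x = rng.random()
        label = 1 if rng.random() < x else 0
        sample.append((x, label))
    return sample


def threshold_spread(n_samples=50, sample_size=100, seed=0):
    """Cut points chosen on independent learning samples, and their standard deviation."""
    rng = random.Random(seed)
    cuts = [best_threshold(draw_sample(rng, sample_size)) for _ in range(n_samples)]
    return cuts, statistics.pstdev(cuts)


if __name__ == "__main__":
    cuts, spread = threshold_spread()
    print(f"min={min(cuts):.3f} max={max(cuts):.3f} std={spread:.3f}")
```

A wide spread in the chosen cut points means that trees grown on different samples of the same problem would test different thresholds, which is the kind of instability the abstract refers to.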

Disciplines :

Computer science

Author, co-author :

Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Language :

English

Title :

Investigation and reduction of discretization variance in decision tree induction

Publication date :

2000

Event name :

European Conference on Machine Learning

Event place :

Barcelona, Spain

Event date :

2000

Audience :

International

Main work title :

Proceedings of ECML 2000, European Conference on Machine Learning

Publisher :

Springer-Verlag

Collection name :

LNAI 1810

Pages :

162-170

Peer reviewed :

Peer reviewed

Additional URL :

http://www.montefiore.ulg.ac.be/services/stochastic/pubs/2000/GW00

Available on ORBi :

since 16 October 2009
