[en] One of the main difficulties with standard top down induction of decision trees comes from the high variance of these methods. High variance means that, for a given problem and sample size, the resulting tree is strongly dependent on the random nature of the particular sample used for training. Consequently, these algorithms tend to be suboptimal in terms of accuracy and interpretability. This paper analyses this problem in depth and proposes a new method, relying on threshold softening, able to significantly improve the bias/variance tradeoff of decision trees. The algorithm is validated on a number of benchmark problems and its relationship with fuzzy decision tree induction is discussed. This sheds some light on the success of fuzzy decision tree induction and improves our understanding of machine learning, in general.
Disciplines :
Computer science
Author, co-author :
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Olaru, Cristina; Université de Liège - ULiège > Dép. d'électricité, électronique et informatique > Systèmes et modélisation
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Improving the bias/variance tradeoff of decision trees - towards soft tree induction
Publication date :
2001
Journal title :
International Journal of Engineering Intelligent Systems for Electrical Engineering and Communications
Friedman J.H. (1997) On bias, variance, 0/1-loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery 1:55-77.
Geurts P. (2000) Some enhancements of decision tree bagging. Proc. of the 4th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD-2000) , Lyon, France, September; 136-147.
Geurts P. (2001) Dual perturb and combine algorithm. Proc. of the Eighth International Workshop on Artificial Intelligence and Statistics , Key-West, Florida, January; 196-201.
Geurts P., Wehenkel L. (2000) Investigation and reduction of discretization variance in decision tree induction. Proc. of the 11th European Conference on Machine Learning (ECML-2000) , Barcelona, May; 162-170.
Webb G.I. (2000) Multiboosting: A technique for combining boosting and wagging. Machine Learning , August; 40(2):158-196.
Wehenkel L. (1993) Decision tree pruning using an additive information quality measure. Uncertainty in Intelligent Systems , B. Bouchon-Meunier, L. Valverde, and R. R. Yager, editors, Elsevier; 397-411.
Wehenkel L. Discretization of continuous attributes for supervised learning. Variance evaluation and variance reduction. Proc. of IFSA'97, Int. Fuzzy Systems Assoc. World Congress, Special session on Learning in a fuzzy framework; .