Doctoral thesis (Dissertations and theses)
Contributions to decision tree induction: bias/variance tradeoff and time series classification
Geurts, Pierre
2002
 

Files


Full Text
geurts-phd2002.pdf
Author postprint (4.29 MB)
Download

All documents in ORBi are protected by a user license.

Send to



Details



Keywords :
Machine learning
Abstract :
[en] Because of the rapid progress of computer and information technology, large amounts of data are nowadays available in a lot of domains. Automatic learning aims at developing algorithms able to produce synthetic high-level information, or models, from this data. Learning algorithms are generally evaluated according to three different criteria: interpretability (how well the model helps to understand the data), predictive accuracy (how well the model can predict unseen situations), and computational efficiency (how fast is the algorithm and how it scales to large databases). This thesis explores two issues in automatic learning: the improvement of the well-known decision tree induction method and the problem of learning classification models for time series data. Decision tree induction method is an automatic learning algorithm which focuses on the modeling of input/output relationships. While this algorithm is among the fastest and most interpretable methods, its accuracy is not always competitive with respect to other algorithms. It is commonly admitted that this suboptimality is due to the excessive variance of this method. We first carry out an empirical study which shows quantitatively how important this variance is, i.e. how strongly decision trees depend on the random nature of the database used to infer them. These experiments confirm that this variance is detrimental not only from the point of view of accuracy but also from the point of view of interpretability. With the goal of improving both interpretability and accuracy, we consider three variance reduction techniques for decision trees. First, in the goal of improving mainly interpretability, we propose several methods which try to stabilize the parameters chosen during tree induction. While these methods succeed in reducing the variability of the parameters, they produce only a slight improvement of the accuracy. Then, we consider perturb and combine algorithms (e.g. bagging, boosting) which consist in combining the predictions of several models obtained by randomizing in some way the learning process. Inspired by the high variance of the parameters defining a decision tree, we propose an extremely randomized decision tree induction algorithm, called extra-tree, which chooses all parameters at random during induction. The aggregation of several of these extra-trees gives an important reduction of variance and this algorithm compares favorably in terms of accuracy and computational efficiency with both bagging and boosting. Because of the randomization of the parameters, the resulting method is also competitive with classical decision tree induction in terms of computational efficiency. In addition to these two approaches, we propose a ``dual'' perturb and combine algorithm which delays the perturbation at the prediction stage and hence requires only one model. In combination with decision tree, this method actually bridges the gap between single decision trees and perturb and combine algorithms. Of the first, it saves the interpretability (by using only one model), and with perturb and combine algorithm, it shares some of the accuracy (by reducing the variance). The second topic of the thesis is the problem of time series classification. The most direct way to solve this problem is to apply existing learning algorithms on low-level variables which correspond to the values of a time series at several time points. Experiments with the tree-based algorithms studied in the first part of the thesis shows that this approach is limited. A variance reduction techniques is then proposed specifically for this kind of data which consists in aggregating the prediction given by a classification model for subsequences of time series. Since this method does not provide interpretable models, we propose a second method which extends decision tree tests by allowing them to detect local shift invariant properties, or patterns, in time series. The study proposed in this part of the thesis is only a first step in the domain but our conclusions give some future work directions for handling complex type of data with automatic learning methods.
Disciplines :
Computer science
Author, co-author :
Geurts, Pierre  ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Contributions to decision tree induction: bias/variance tradeoff and time series classification
Defense date :
May 2002
Number of pages :
260
Institution :
ULiège - Université de Liège
Degree :
Docteur en Sciences Appliquées
Promotor :
Wehenkel, Louis  ;  Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Available on ORBi :
since 15 October 2009

Statistics


Number of views
2017 (45 by ULiège)
Number of downloads
1928 (39 by ULiège)

Bibliography


Similar publications



Contact ORBi