[en] This talk describes Gradient Boosted Regression Trees (GBRT), a powerful statistical learning technique with applications in a variety of areas, ranging from web page ranking to environmental niche modeling. GBRT is a key ingredient of many winning solutions in data-mining competitions such as the Netflix Prize, the GE Flight Quest, or the Heritage Health Price.
We give a brief introduction to the GBRT model and regression trees -- focusing on intuition rather than mathematical formulas. The majority of the talk is dedicated to an in depth discussion how to apply GBRT in practice using scikit-learn. We cover important topics such as regularization, model tuning and model interpretation that should significantly improve your score on Kaggle.
Disciplines :
Computer science
Author, co-author :
Prettenhofer, Peter
Louppe, Gilles ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation