Unpublished conference/Abstract (Scientific congresses and symposiums)
Accelerating Random Forests in Scikit-Learn
Louppe, Gilles
2014, EuroScipy 2014
Details



Keywords :
machine learning; scikit-learn; python; random forests
Abstract :
[en] Random Forests are unquestionably one of the most robust, accurate and versatile tools for solving machine learning tasks. Implementing this algorithm properly and efficiently remains, however, a challenging task, involving issues that are easily overlooked if not considered with care. In this talk, we present the Random Forests implementation developed within the Scikit-Learn machine learning library. In particular, we describe the iterative team efforts that led us to gradually improve our codebase and eventually make Scikit-Learn's Random Forests one of the most efficient implementations in the scientific ecosystem, across all libraries and programming languages. Algorithmic and technical optimizations that have made this possible include:
- An efficient formulation of the decision tree algorithm, tailored for Random Forests;
- Cythonization of the tree induction algorithm;
- CPU cache optimizations, through low-level organization of data into contiguous memory blocks;
- Efficient multi-threading through GIL-free routines;
- A dedicated sorting procedure, taking into account the properties of data;
- Shared pre-computations whenever critical.
Overall, we believe that lessons learned from this case study extend to a broad range of scientific applications and may be of interest to anybody doing data analysis in Python.
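To make the role of sorting in tree induction more concrete, the following is a minimal pure-Python sketch of a common way to search for the best split on a single feature: sort the samples once, then sweep over candidate thresholds while updating class counts incrementally. It is an illustration only and does not reproduce Scikit-Learn's Cython implementation; the function names `gini` and `best_split`, and the choice of Gini impurity as the criterion, are assumptions made for this example.

```python
from collections import Counter


def gini(counts, n):
    """Gini impurity of a node holding n samples with the given class counts."""
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best split on one feature.

    The samples are sorted once; class counts on each side are then updated
    incrementally, so evaluating a candidate threshold costs O(#classes)
    rather than a full pass over the data. If no split improves on the
    parent node, the threshold is None.
    """
    order = sorted(range(len(values)), key=values.__getitem__)
    n = len(order)
    left = Counter()          # class counts left of the candidate threshold
    right = Counter(labels)   # class counts right of the candidate threshold
    best = (None, gini(right, n))  # baseline: impurity of the unsplit node
    for pos in range(n - 1):
        i = order[pos]
        left[labels[i]] += 1
        right[labels[i]] -= 1
        lo, hi = values[i], values[order[pos + 1]]
        if lo == hi:
            continue  # no threshold can separate equal feature values
        n_left = pos + 1
        n_right = n - n_left
        impurity = (n_left * gini(left, n_left)
                    + n_right * gini(right, n_right)) / n
        if impurity < best[1]:
            best = ((lo + hi) / 2.0, impurity)
    return best


if __name__ == "__main__":
    print(best_split([2.0, 1.0, 3.0, 4.0], [0, 0, 1, 1]))
```

In this sketch the sort dominates the cost of the search, which gives some intuition for why a sorting procedure adapted to the data could matter for overall training time.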
Disciplines :
Computer science
Author, co-author :
Louppe, Gilles  ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Accelerating Random Forests in Scikit-Learn
Publication date :
29 August 2014
Event name :
EuroScipy 2014
Event place :
Cambridge, United Kingdom
Event date :
from 27-08-2014 to 30-08-2014
Audience :
International
Available on ORBi :
since 09 September 2014
