Paper published in a book (Scientific congresses and symposiums)
Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
Kedenburg, Gunnar; Fonteneau, Raphaël; Munos, Rémi
2013, in Advances in Neural Information Processing Systems 26 (2013)
Peer reviewed
 

Files


Full Text
nips2013.pdf
Publisher postprint (276.47 kB)


Details



Keywords :
Reinforcement Learning; Markov Decision Processes; On-line Planning
Abstract :
This paper addresses the problem of online planning in Markov decision processes using a randomized simulator, under a budget constraint. We propose a new algorithm which is based on the construction of a forest of planning trees, where each tree corresponds to a random realization of the stochastic environment. The trees are constructed using a “safe” optimistic planning strategy combining the optimistic principle (in order to explore the most promising part of the search space first) with a safety principle (which guarantees a certain amount of uniform exploration). In the decision-making step of the algorithm, the individual trees are aggregated and an immediate action is recommended. We provide a finite-sample analysis and discuss the trade-off between the principles of optimism and safety. We also report numerical results on a benchmark problem. Our algorithm performs as well as state-of-the-art optimistic planning algorithms, and better than a related algorithm which additionally assumes the knowledge of all transition distributions.
Disciplines :
Computer science
Author, co-author :
Kedenburg, Gunnar
Fonteneau, Raphaël; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Systèmes et modélisation
Munos, Rémi; Inria Lille - Nord Europe
Language :
English
Title :
Aggregating Optimistic Planning Trees for Solving Markov Decision Processes
Publication date :
2013
Event name :
Neural Information Processing Systems 26 (2013)
Event place :
Lake Tahoe, United States
Event date :
December 5-10, 2013
Audience :
International
Main work title :
Advances in Neural Information Processing Systems 26 (2013)
Pages :
2382-2390
Peer reviewed :
Peer reviewed
Available on ORBi :
since 20 January 2014

Statistics


Number of views
39 (0 by ULiège)
Number of downloads
28 (0 by ULiège)

Scopus citations®
2
Scopus citations® without self-citations
1
