Paper published in a book (Scientific congresses and symposiums)
Optimistic Planning for Belief-Augmented Markov Decision Processes
Fonteneau, Raphaël; Busoniu, Lucian; Munos, Rémi
2013. In Proceedings 2013 Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-13), Singapore, 15–19 April 2013
Peer reviewed
 

Files


Full Text
adprl.pdf
Author preprint (475.22 kB)
Details



Keywords :
Reinforcement Learning; Bayesian Optimization; Markov Decision Processes
Abstract :
This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [Busoniu2011, Busoniu2012] to settings where the transition model of the MDP is initially unknown and is learned progressively through interactions with the environment. Knowledge about the unknown MDP is represented as a probability distribution over all possible transition models, using Dirichlet distributions. BOP plans in the belief-augmented state space, constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal as the budget parameter tends to infinity. Preliminary empirical validations show promising performance.
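The belief representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a finite MDP with one Dirichlet distribution per (state, action) pair, so the posterior is summarized by a table of counts. All names here (`uniform_prior`, `update_belief`, `expected_transition`, `augmented_state`) are hypothetical and are not taken from the paper; the sketch covers only the belief update, not the BOP planning procedure.

```python
from typing import Dict, Tuple

# Hypothetical names throughout: this illustrates only a Dirichlet belief
# over transition models, not the BOP planning algorithm itself.
Counts = Dict[Tuple[int, int], Tuple[float, ...]]


def uniform_prior(n_states: int, n_actions: int, alpha: float = 1.0) -> Counts:
    """One Dirichlet(alpha, ..., alpha) per (state, action) pair."""
    return {(s, a): tuple([alpha] * n_states)
            for s in range(n_states) for a in range(n_actions)}


def update_belief(counts: Counts, s: int, a: int, s_next: int) -> Counts:
    """Posterior after observing the transition (s, a, s_next).

    For a Dirichlet prior this amounts to incrementing a single count.
    A new table is returned so the earlier belief stays available.
    """
    row = list(counts[(s, a)])
    row[s_next] += 1.0
    new_counts = dict(counts)
    new_counts[(s, a)] = tuple(row)
    return new_counts


def expected_transition(counts: Counts, s: int, a: int) -> Tuple[float, ...]:
    """Posterior mean of P(. | s, a): the normalized Dirichlet parameters."""
    row = counts[(s, a)]
    total = sum(row)
    return tuple(c / total for c in row)


def augmented_state(s: int, counts: Counts):
    """Belief-augmented state: the original state paired with the belief."""
    return (s, tuple(sorted(counts.items())))


prior = uniform_prior(n_states=2, n_actions=1)
posterior = update_belief(prior, s=0, a=0, s_next=1)
print(expected_transition(posterior, 0, 0))
```

Returning a fresh table from `update_belief` is a design choice for this sketch: a planner that expands many hypothetical futures from the same belief needs each branch to keep its own posterior.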
Disciplines :
Computer science
Author, co-author :
Fonteneau, Raphaël; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Busoniu, Lucian; Technical University of Cluj-Napoca
Munos, Rémi; Inria Lille - Nord Europe
Language :
English
Title :
Optimistic Planning for Belief-Augmented Markov Decision Processes
Publication date :
2013
Event name :
IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2013)
Event place :
Singapore, Singapore
Event date :
April 16-19, 2013
Audience :
International
Main work title :
Proceedings 2013 Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-13), Singapore, 15–19 April 2013
Peer reviewed :
Peer reviewed
Available on ORBi :
since 18 January 2014

Statistics


Number of views
43 (6 by ULiège)
Number of downloads
192 (3 by ULiège)

Scopus citations®
11
Scopus citations® without self-citations
7