Reference : Optimistic Planning for Belief-Augmented Markov Decision Processes
Scientific congresses and symposiums : Paper published in a book
Engineering, computing & technology : Computer science
Fonteneau, Raphaël [Université de Liège - ULiège, Department of Electrical Engineering and Computer Science (Institut Montefiore), Systems and Modeling]
Busoniu, Lucian [Technical University of Cluj-Napoca]
Munos, Rémi [Inria Lille - Nord Europe]
Proceedings of the 2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2013), Singapore, April 16-19, 2013
[en] Reinforcement Learning ; Bayesian Optimization ; Markov Decision Processes
[en] This paper presents the Bayesian Optimistic Planning (BOP) algorithm, a novel model-based Bayesian reinforcement learning approach. BOP extends the planning approach of the Optimistic Planning for Markov Decision Processes (OP-MDP) algorithm [Busoniu2011, Busoniu2012] to settings where the transition model of the MDP is initially unknown and progressively learned through interactions with the environment. Knowledge about the unknown MDP is represented as a probability distribution over all possible transition models, using Dirichlet distributions, and BOP plans in the belief-augmented state space constructed by concatenating the original state vector with the current posterior distribution over transition models. We show that BOP becomes Bayesian optimal as the budget parameter increases to infinity. Preliminary empirical validation shows promising performance.
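The abstract describes maintaining a Dirichlet posterior over transition models, updated from observed transitions. A minimal sketch of that bookkeeping is shown below; the class name, method names, and uniform prior are illustrative assumptions, not an API from the paper.

```python
import numpy as np

class DirichletTransitionBelief:
    """Illustrative Dirichlet belief over an unknown MDP transition model.

    alpha[s, a, s'] are Dirichlet concentration parameters for the
    distribution over next states s' given (s, a). A uniform prior
    (all parameters equal) is assumed here for simplicity.
    """

    def __init__(self, n_states, n_actions, prior=1.0):
        self.alpha = np.full((n_states, n_actions, n_states), prior)

    def update(self, s, a, s_next):
        # Bayesian posterior update: observing a transition (s, a, s')
        # simply increments the corresponding Dirichlet count.
        self.alpha[s, a, s_next] += 1.0

    def expected_model(self):
        # Posterior mean transition probabilities: normalize counts
        # over next states.
        return self.alpha / self.alpha.sum(axis=2, keepdims=True)

belief = DirichletTransitionBelief(n_states=3, n_actions=2)
belief.update(0, 1, 2)  # observe one transition s=0, a=1 -> s'=2
P = belief.expected_model()
# With a uniform prior and one observation, P[0, 1] = [1/4, 1/4, 2/4]
```

In the belief-augmented view sketched here, the augmented state would pair the environment state `s` with the current `alpha` tensor, so planning accounts for how future observations sharpen the posterior.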

File(s) associated to this reference

Fulltext file(s):

adprl.pdf (author preprint, 464.08 kB, open access)


All documents in ORBi are protected by a user license.