Paper published in a book (Scientific congresses and symposiums)
Approximate Bayes Optimal Policy Search using Neural Networks
Castronovo, Michaël; François-Lavet, Vincent; Fonteneau, Raphaël et al.
2017, in Proceedings of the 9th International Conference on Agents and Artificial Intelligence (ICAART 2017)
Peer reviewed
 

Files


Full Text
ANN-BRL_final.pdf
Publisher postprint (300.99 kB)

All documents in ORBi are protected by a user license.




Details



Keywords :
Bayesian reinforcement learning; artificial neural networks; offline policy search
Abstract :
Bayesian Reinforcement Learning (BRL) agents aim to maximise the expected rewards collected while interacting with an unknown Markov Decision Process (MDP), using some prior knowledge. State-of-the-art BRL agents rely on frequent updates of the belief on the MDP as new observations of the environment are made. This offers theoretical guarantees of convergence to an optimum, but is computationally intractable, even on small-scale problems. In this paper, we present a method that circumvents this issue by training a parametric policy able to recommend an action directly from raw observations. Artificial Neural Networks (ANNs) are used to represent this policy, and are trained on trajectories sampled from the prior. The trained model is then used online, and is able to act on the real MDP at a very low computational cost. Our new algorithm shows strong empirical performance on a wide range of test problems, and is robust to inaccuracies of the prior distribution.
Disciplines :
Computer science
Author, co-author :
Castronovo, Michaël ;  Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
François-Lavet, Vincent ;  Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Dép. d'électric., électron. et informat. (Inst.Montefiore)
Fonteneau, Raphaël ;  Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Dép. d'électric., électron. et informat. (Inst.Montefiore)
Ernst, Damien ;  Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
Couëtoux, Adrien ;  Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
Language :
English
Title :
Approximate Bayes Optimal Policy Search using Neural Networks
Publication date :
February 2017
Event name :
9th International Conference on Agents and Artificial Intelligence (ICAART 2017)
Event place :
Porto, Portugal
Event date :
24 February 2017 to 26 February 2017
Audience :
International
Main work title :
Proceedings of the 9th International Conference on Agents and Artificial Intelligence (ICAART 2017)
Peer reviewed :
Peer reviewed
Tags :
CÉCI : Consortium des Équipements de Calcul Intensif
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]
CÉCI - Consortium des Équipements de Calcul Intensif [BE]
Available on ORBi :
since 16 December 2016

Statistics


Number of views
732 (45 by ULiège)
Number of downloads
833 (23 by ULiège)

Scopus citations®
2
Scopus citations® without self-citations
1
OpenCitations
1
