Contribution to collective works (Parts of books)
Towards min max generalization in reinforcement learning
Fonteneau, Raphaël; Murphy, Susan; Wehenkel, Louis et al.
2011, in Filipe, Joaquim; Fred, Ana; Sharp, Bernadette (Eds.), Agents and Artificial Intelligence: International Conference, ICAART 2010, Valencia, Spain, January 2010, Revised Selected Papers
Peer reviewed
 

Files


Full Text
towards-min-max-generalisation-RL.pdf
Publisher postprint (463.93 kB)

Details



Keywords :
reinforcement learning; generalization
Abstract :
[en] In this paper, we introduce a min max approach for addressing the generalization problem in Reinforcement Learning. The min max approach works by determining a sequence of actions that maximizes the worst return that could possibly be obtained considering any dynamics and reward function compatible with the sample of trajectories and some prior knowledge on the environment. We consider the particular case of deterministic Lipschitz continuous environments over continuous state spaces, finite action spaces, and a finite optimization horizon. We discuss the non-triviality of computing an exact solution of the min max problem even after reformulating it so as to avoid search in function spaces. For addressing this problem, we propose to replace, inside this min max problem, the search for the worst environment given a sequence of actions by an expression that lower bounds the worst return that can be obtained for a given sequence of actions. This lower bound has a tightness that depends on the sample sparsity. From there, we propose an algorithm of polynomial complexity that returns a sequence of actions leading to the maximization of this lower bound. We give a condition on the sample sparsity ensuring that, for a given initial state, the proposed algorithm produces an optimal sequence of actions in open-loop. Our experiments show that this algorithm can lead to more cautious policies than algorithms combining dynamic programming with function approximators.
Disciplines :
Electrical & electronics engineering
Author, co-author :
Fonteneau, Raphaël; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Murphy, Susan
Wehenkel, Louis; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Ernst, Damien; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Towards min max generalization in reinforcement learning
Publication date :
2011
Main work title :
Agents and Artificial Intelligence: International Conference, ICAART 2010, Valencia, Spain, January 2010, Revised Selected Papers
Editor :
Filipe, Joaquim
Fred, Ana
Sharp, Bernadette
Publisher :
Springer
ISBN/EAN :
978-3-642-19889-2
Pages :
61-77
Peer reviewed :
Peer reviewed
Available on ORBi :
since 04 October 2011

