Article (Scientific journals)
Min max generalization for deterministic batch mode reinforcement learning: relaxation schemes
Fonteneau, Raphaël; Ernst, Damien; Boigelot, Bernard et al.
2013. In SIAM Journal on Control and Optimization, 51 (5), p. 3355–3385
Peer Reviewed verified by ORBi
 

Full Text
fonteneau-MIN-MAX-BMRL.pdf
Publisher postprint (661.57 kB)
Keywords :
reinforcement learning; min max generalization; nonconvex optimization; computational complexity
Abstract :
We study the min max optimization problem introduced in Fonteneau et al. [Towards min max reinforcement learning, ICAART 2010, Springer, Heidelberg, 2011, pp. 61–77] for computing policies for batch mode reinforcement learning in a deterministic setting with fixed, finite time horizon. First, we show that the min part of this problem is NP-hard. We then provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, can also be solved in polynomial time. We also theoretically prove and empirically illustrate that both relaxation schemes provide better results than those given in [Fonteneau et al., 2011, as cited above].
Disciplines :
Computer science
Author, co-author :
Fonteneau, Raphaël; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Ernst, Damien; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
Boigelot, Bernard; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique
Louveaux, Quentin; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Système et modélisation : Optimisation discrète
Language :
English
Title :
Min max generalization for deterministic batch mode reinforcement learning: relaxation schemes
Publication date :
2013
Journal title :
SIAM Journal on Control and Optimization
ISSN :
0363-0129
eISSN :
1095-7138
Publisher :
Society for Industrial & Applied Mathematics
Volume :
51
Issue :
5
Pages :
3355–3385
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBi :
since 29 September 2013

