Scientific journals : Article
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/156574
Min max generalization for deterministic batch mode reinforcement learning: relaxation schemes
English
Fonteneau, Raphaël [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Ernst, Damien [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids]
Boigelot, Bernard [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique]
Louveaux, Quentin [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Système et modélisation : Optimisation discrète]
2013
SIAM Journal on Control and Optimization
Society for Industrial & Applied Mathematics
51
5
3355–3385
Yes (verified by ORBi)
International
0363-0129
[en] reinforcement learning ; min max generalization ; nonconvex optimization ; computational complexity
[en] We study the min max optimization problem introduced in Fonteneau et al. [Towards min max reinforcement learning, ICAART 2010, Springer, Heidelberg, 2011, pp. 61–77] for computing policies for batch mode reinforcement learning in a deterministic setting with fixed, finite time horizon. First, we show that the min part of this problem is NP-hard. We then provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, can also be solved in polynomial time. We also theoretically prove and empirically illustrate that both relaxation schemes provide better results than those given in [Fonteneau et al., 2011, as cited above].
Researchers ; Professionals ; Students
10.1137/120867263

File(s) associated to this reference

Fulltext file(s):

File: fonteneau-MIN-MAX-BMRL.pdf
Version: Publisher postprint
Size: 646.06 kB
Access: Open access

All documents in ORBi are protected by a user license.