Reference : Lipschitz robust control from off-policy trajectories
Scientific congresses and symposiums : Paper published in a book
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/172988
Lipschitz robust control from off-policy trajectories
English
Fonteneau, Raphaël [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Ernst, Damien [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids]
Boigelot, Bernard [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique]
Louveaux, Quentin [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Système et modélisation : Optimisation discrète]
2014
Proceedings of the 53rd IEEE Conference on Decision and Control (IEEE CDC 2014)
Yes
No
International
53rd IEEE Conference on Decision and Control (IEEE CDC 2014)
December 15-17, 2014
Los Angeles
USA
[en] We study the minmax optimization problem introduced in [Fonteneau et al. (2011), ``Towards min max reinforcement learning'', Springer CCIS, vol. 129, pp. 61-77] for computing control policies for batch mode reinforcement learning in a deterministic setting with a fixed, finite optimization horizon. First, we state that the $\min$ part of this problem is NP-hard. We then provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, can also be solved in polynomial time. We theoretically show that both relaxation schemes provide better results than those given in [Fonteneau et al. (2011)].
Researchers

File(s) associated to this reference

Fulltext file(s):

File : CDC_SIAM.pdf
Commentary : (none)
Version : Author preprint
Size : 349.95 kB
Access : Open access

All documents in ORBi are protected by a user license.