Batch mode reinforcement learning; Min max generalization; Non-convex optimization
Abstract :
[en] We study the min max optimization problem introduced in [Fonteneau, 2011] for computing policies for batch mode reinforcement learning in a deterministic setting. This problem is NP-hard. We focus on the two-stage case for which we provide two relaxation schemes. The first relaxation scheme works by dropping some constraints in order to obtain a problem that is solvable in polynomial time. The second relaxation scheme, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. Both relaxation schemes are shown to provide better results than those given in [Fonteneau, 2011].
Disciplines :
Computer science
Author, co-author :
Fonteneau, Raphaël ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
Boigelot, Bernard ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Informatique
Louveaux, Quentin ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Système et modélisation : Optimisation discrète
Language :
English
Title :
Relaxation schemes for min max generalization in deterministic batch mode reinforcement learning
Publication date :
December 2011
Event name :
4th International NIPS Workshop on Optimization for Machine Learning (OPT 2011)
Event place :
Sierra Nevada, Spain
Event date :
December 16th, 2011
Audience :
International
Main work title :
4th International NIPS Workshop on Optimization for Machine Learning (OPT 2011)
Peer reviewed :
Peer reviewed
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]