Batch mode reinforcement learning is a subclass of reinforcement learning in which the decision-making problem has to be addressed using only a batch of trajectories, with no model, no simulator, and no additional interactions with the actual system. In this setting, we discuss a minmax approach to generalization for deterministic problems with continuous state spaces. This approach aims to compute robust policies, given that the sample of trajectories may be arbitrarily bad. The discussion is intertwined with the description of a fascinating batch mode reinforcement learning-type problem that takes trajectories of societies as input and for which crucial decisions have to be made well: the energy transition.
Disciplines :
Computer science
Author, co-author :
Fonteneau, Raphaël ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
From Bad Models to Good Policies: an Intertwined Story about Energy and Reinforcement Learning
Publication date :
12 December 2014
Event name :
2014 NIPS Workshop «From Bad Models to Good Policies Workshop (Sequential Decision Making under Uncertainty)», Montreal, December 12th, 2014