Speech/Talk (Diverse speeches and writings)
Batch Mode Reinforcement Learning based on the Synthesis of Artificial Trajectories
Fonteneau, Raphaël
2012
 

Files


Full Text
10_12_2012@CMS-Montreal.pdf
Author postprint (3.04 MB)

Details



Keywords :
Batch Mode Reinforcement Learning; Dynamic Treatment Regimes
Abstract :
Batch mode reinforcement learning (BMRL) is a field of research that focuses on inferring high-performance control policies when the only information available on the control problem is a set of trajectories. Such situations occur, for instance, in clinical trials, for which data are collected in the form of batch time series of clinical indicators. When the (state, decision) spaces are large or continuous, most techniques proposed in the literature for solving BMRL problems combine value or policy iteration schemes from Dynamic Programming (DP) theory with function approximators representing (state-action) value functions. While successful in many studies, the use of function approximators for solving BMRL problems also has drawbacks. In particular, it makes performance guarantees difficult to obtain and does not systematically take advantage of optimal trajectories. In this talk, I will present a new line of research for solving BMRL problems, based on the synthesis of "artificial trajectories", which opens avenues for designing new BMRL algorithms. In particular, it avoids the two above-mentioned drawbacks of function approximators.
Disciplines :
Computer science
Author, co-author :
Fonteneau, Raphaël ;  Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Batch Mode Reinforcement Learning based on the Synthesis of Artificial Trajectories
Publication date :
10 December 2012
Event name :
Winter Meeting of the Canadian Mathematical Society
Event place :
Montreal, Canada
Event date :
7 to 10 December 2012
Audience :
International
Available on ORBi :
since 02 June 2015

