Paper published in a book (Scientific congresses and symposiums)
Inferring bounds on the performance of a control policy from a sample of trajectories
Fonteneau, Raphaël; Murphy, Susan; Wehenkel, Louis et al.
2009. In Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)
Peer reviewed
 

Files


Full Text
bounds-trajectories-adprl.pdf
Publisher postprint (255.42 kB)
Annexes
RL-TU-Delft-2009.pdf
Publisher postprint (299.94 kB)

Details



Keywords :
reinforcement learning; model-free; lower bound on a policy; performance guarantee
Abstract :
We propose an approach for inferring bounds on the finite-horizon return of a control policy from an off-policy sample of trajectories consisting of state transitions, rewards, and control actions. The dynamics, control policy, and reward function are assumed to be deterministic and Lipschitz continuous. Under these assumptions, we derive an algorithm, polynomial in the sample size and the length of the optimization horizon, that computes these bounds, and we characterize their tightness in terms of the sample density.
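The abstract does not state the bound itself, so the following is only a minimal sketch of one way a lower bound of this kind can be built from Lipschitz assumptions; it should not be read as the paper's exact algorithm. Everything named here is an assumption made for illustration: scalar states and actions, the function names `lipschitz_q` and `lower_bound`, the constants `L_f`, `L_rho`, `L_h` (Lipschitz constants of the dynamics, reward, and policy with respect to the metric |x - x'| + |u - u'|), and a sample given as one-step transitions (x, u, r, y). The sketch chains T sample transitions, penalizes each mismatch between where the previous transition ended and where the next one starts, and maximizes over chains with a Viterbi-style recursion costing O(T n^2) for n transitions.

```python
from typing import Callable, List, Tuple

# One-step transition (x, u, r, y): state, action, reward, successor state.
Transition = Tuple[float, float, float, float]


def lipschitz_q(n_steps: int, L_f: float, L_rho: float, L_h: float) -> float:
    """Lipschitz constant of the n-step return in (state, first action),
    assuming the policy is followed after the first action."""
    return L_rho * sum((L_f * (1.0 + L_h)) ** i for i in range(n_steps))


def lower_bound(
    x0: float,
    policy: Callable[[int, float], float],
    sample: List[Transition],
    T: int,
    L_f: float,
    L_rho: float,
    L_h: float,
) -> float:
    """Largest chained lower bound on the T-step return of `policy` from x0."""
    if not sample or T < 1:
        raise ValueError("need a non-empty sample and a horizon T >= 1")

    # Step 0: compare each transition with the true start (x0, policy(0, x0)).
    L = lipschitz_q(T, L_f, L_rho, L_h)
    u0 = policy(0, x0)
    best = [r - L * (abs(x - x0) + abs(u - u0)) for x, u, r, _ in sample]

    # Steps 1..T-1: extend the best chain ending in each predecessor j.
    for t in range(1, T):
        L = lipschitz_q(T - t, L_f, L_rho, L_h)
        # Action the policy would take at the end state of each predecessor.
        targets = [policy(t, y) for _, _, _, y in sample]
        best = [
            max(
                best[j] + r - L * (abs(x - sample[j][3]) + abs(u - targets[j]))
                for j in range(len(sample))
            )
            for x, u, r, _ in sample
        ]
    return max(best)


# Toy system (hypothetical): f(x, u) = 0.5 x + u, reward -|x|, policy -0.5 x.
def f(x: float, u: float) -> float:
    return 0.5 * x + u

def rho(x: float, u: float) -> float:
    return -abs(x)

def h(t: int, x: float) -> float:
    return -0.5 * x

sample = [(x, u, rho(x, u), f(x, u))
          for x in (0.0, 0.5, 1.0) for u in (-0.5, 0.0, 0.5)]
bound = lower_bound(1.0, h, sample, T=3, L_f=1.0, L_rho=1.0, L_h=0.5)
```

In this toy setting the policy drives the state from 1 to 0 in one step, so the true 3-step return is -1. The grid happens to contain the transitions visited by the policy, so the mismatch penalties vanish along that chain; with a sparser sample the penalties grow and the bound loosens, which is the dependence on sample density that the abstract refers to.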
Disciplines :
Computer science
Author, co-author :
Fonteneau, Raphaël; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Murphy, Susan
Wehenkel, Louis; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Ernst, Damien; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Inferring bounds on the performance of a control policy from a sample of trajectories
Publication date :
2009
Event name :
IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)
Event place :
Nashville, United States
Event date :
March 30 - April 2, 2009
Audience :
International
Main work title :
Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)
ISBN/EAN :
978-1-4244-2761-1
Pages :
117-123
Peer reviewed :
Peer reviewed
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]
Available on ORBi :
since 03 June 2009

Statistics


Number of views
73 (11 by ULiège)
Number of downloads
238 (7 by ULiège)

Scopus citations®
10
Scopus citations® without self-citations
5
OpenCitations
5
