Generating informative trajectories by using bounds on the return of control policies

Fonteneau, Raphaël; Murphy, Susan; Wehenkel, Louis; Ernst, Damien

Paper published in a book (Scientific congresses and symposiums)

2010 • In Proceedings of the Workshop on Active Learning and Experimental Design 2010 (in conjunction with AISTATS 2010)

Peer reviewed

Permalink
https://hdl.handle.net/2268/36015

Files

Full Text

Fonteneau2010ALED.pdf

Publisher postprint (133.79 kB)

Annexes

NIPS-2010-talk.pdf

Publisher postprint (474.29 kB)

This paper, together with the three papers "Model-free Monte Carlo-like policy evaluation", "Inferring bounds on the performance of a control policy from a sample of trajectories" and "A cautious approach to generalization in reinforcement learning", represents a body of work in batch-mode RL that is based on the rebuilding of trajectories. This file is a presentation of that body of work.

All documents in ORBi are protected by a user license.

Details

Keywords :

reinforcement learning; optimal control; sampling strategies

Abstract :

We propose new methods for guiding the generation of informative trajectories when solving discrete-time optimal control problems. These methods exploit recently published results that provide ways to compute bounds on the return of control policies from a set of trajectories.

Disciplines :

Computer science

Author, co-author :

Fonteneau, Raphaël ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Murphy, Susan

Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Language :

English

Title :

Generating informative trajectories by using bounds on the return of control policies

Publication date :

May 2010

Event name :

Workshop on Active Learning and Experimental Design 2010 (in conjunction with AISTATS 2010)

Event place :

Chia Laguna, Sardinia, Italy

Event date :

May 16, 2010

Audience :

International

Main work title :

Proceedings of the Workshop on Active Learning and Experimental Design 2010 (in conjunction with AISTATS 2010)

Peer review/Selection committee :

Peer reviewed

Funders :

FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture
F.R.S.-FNRS - Fonds de la Recherche Scientifique

Available on ORBi :

since 14 May 2010

Statistics

Number of views

163 (13 by ULiège)

Number of downloads

144 (7 by ULiège)
