Paper published in a journal (Scientific congresses and symposiums)
Empirical Analysis of Policy Gradient Algorithms where Starting States are Sampled accordingly to Most Frequently Visited States
Aittahar, Samy; Fonteneau, Raphaël; Ernst, Damien
2020, in IFAC-PapersOnLine, 53 (2), p. 8097–8104
Peer Reviewed verified by ORBi
 

Files


Full Text
1-s2.0-S2405896320329396-main.pdf
Publisher postprint (641.34 kB)
Annexes
teaser_slide_so_sober.pdf
Publisher postprint (93.42 kB)
Teaser slide for the IFAC conference
ifac_mcp0_presentation.pdf
Publisher postprint (816.51 kB)
Presentation slides for the IFAC conference
youtube_link.txt
Publisher postprint (49 B)
Presentation video available on Youtube https://www.youtube.com/watch?v=TA3_vaZWP20&t=2s (copy/paste the link into your browser's address bar)

All documents in ORBi are protected by a user license.

Details



Keywords :
Reinforcement learning; Control; Policy gradient
Abstract :
In this paper, we propose an extension to policy gradient algorithms in which starting states may be sampled from a probability distribution that differs from the one used to specify the reinforcement learning task. In particular, we suggest that, between policy updates, starting states should be sampled from a probability density function that approximates the state visitation frequency of the current policy. Results from several environments show an improvement in mean cumulative reward and substantially more stable updates compared with vanilla policy gradient algorithms in which the starting state distribution is either the one specified by the environment or a uniform distribution over the state space. A sensitivity analysis over a subset of the hyper-parameters of our algorithm also suggests that these hyper-parameters should be adapted after each policy update to maximise the improvement of the policies.
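The sampling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the visitation density is estimated, so a Gaussian kernel density estimate is assumed here, and the toy one-dimensional environment, the fixed stochastic policy, the bandwidth value, and all function names (`collect_visited_states`, `sample_start_states`, `toy_step`, `toy_policy`) are hypothetical. The policy update itself is omitted.

```python
import random


def collect_visited_states(policy, step, sample_env_start, n_episodes, horizon, rng):
    """Roll out the current policy and record every state it visits."""
    visited = []
    for _ in range(n_episodes):
        state = sample_env_start(rng)
        for _ in range(horizon):
            visited.append(state)
            state = step(state, policy(state, rng), rng)
    return visited


def sample_start_states(visited, n, bandwidth, low, high, rng):
    """Draw n starting states from a Gaussian kernel density estimate of the
    visited states (assumed estimator): pick a visited state uniformly at
    random, perturb it with Gaussian noise, then clip to the state bounds."""
    starts = []
    for _ in range(n):
        centre = rng.choice(visited)
        candidate = rng.gauss(centre, bandwidth)
        starts.append(min(max(candidate, low), high))
    return starts


# Hypothetical toy problem: a noisy walk on the interval [-1, 1].
def toy_step(state, action, rng):
    nxt = state + 0.1 * action + rng.gauss(0.0, 0.01)
    return min(max(nxt, -1.0), 1.0)


def toy_policy(state, rng):
    # Fixed stochastic policy that moves right 80% of the time.
    return 1.0 if rng.random() < 0.8 else -1.0


rng = random.Random(0)
visited = collect_visited_states(toy_policy, toy_step, lambda r: 0.0, 20, 15, rng)
starts = sample_start_states(visited, 100, 0.05, -1.0, 1.0, rng)
```

In a full training loop, the states in `starts` would replace the environment's own starting states for the next batch of rollouts, and `visited` would be re-collected after each policy update so that the density tracks the current policy.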
Disciplines :
Computer science
Author, co-author :
Aittahar, Samy; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Smart grids
Fonteneau, Raphaël; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Smart grids
Ernst, Damien; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Smart grids
Language :
English
Title :
Empirical Analysis of Policy Gradient Algorithms where Starting States are Sampled accordingly to Most Frequently Visited States
Publication date :
2020
Event name :
International Federation of Automatic Control (IFAC) 2020
Event date :
From 11-07-2020 to 17-07-2020
Audience :
International
Journal title :
IFAC-PapersOnLine
ISSN :
2405-8971
eISSN :
2405-8963
Publisher :
Elsevier, Kidlington, United Kingdom
Volume :
53
Issue :
2
Pages :
8097–8104
Peer reviewed :
Peer Reviewed verified by ORBi
Available on ORBi :
since 18 March 2020

