Article (Scientific journals)
The Deep Quality-Value Family of Deep Reinforcement Learning Algorithms
Sabatelli, Matthia; Louppe, Gilles; Geurts, Pierre et al.
2020, in International Joint Conference on Neural Networks (IJCNN 2020)
Peer reviewed
 

Files


Full Text
IJCNN_DQV_Family.pdf
Publisher postprint (1.39 MB)
Annexes
IJCNN_presentation_DQV_family.pdf
Publisher postprint (2.37 MB)




Details



Keywords :
model-free deep reinforcement learning; temporal-difference learning; DQV; DQV-Max Learning
Abstract :
We present a novel approach for learning an approximation of the optimal state-action value function (Q) in model-free Deep Reinforcement Learning (DRL). We propose to learn this approximation while simultaneously learning an approximation of the state-value function (V). We introduce two new DRL algorithms, called DQV-Learning and DQV-Max Learning, which follow this specific learning dynamic. In short, both algorithms use two neural networks for separately learning the V function and the Q function. We validate the effectiveness of this training scheme by thoroughly comparing our algorithms to DRL methods which only learn an approximation of the Q function, namely DQN and DDQN. Our results show that DQV and DQV-Max present several important benefits: they converge significantly faster, can achieve super-human performance on DRL testbeds on which DQN and DDQN failed to do so, and suffer less from the overestimation bias of the Q function.
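To make the two-network idea concrete, here is a minimal sketch of the temporal-difference targets that each network would regress toward. It reflects our reading of the full paper, not anything stated in this abstract, so treat the exact form of the targets as an assumption. The function names, argument names, the `dones` terminal flag, and the default discount factor are all hypothetical. The neural networks, target-network copies, and replay buffer are omitted.

```python
# Hedged sketch: TD targets for DQV-Learning and DQV-Max Learning.
# Assumption (not stated in the abstract): in DQV both networks share the
# target r + gamma * V(s'); in DQV-Max the V network uses
# r + gamma * max_a Q(s', a) and the Q network uses r + gamma * V(s').
# All names below are illustrative only.

def dqv_targets(rewards, next_v, dones, gamma=0.99):
    """Shared regression target for both the V and the Q network (DQV)."""
    return [
        r + (0.0 if done else gamma * v)
        for r, v, done in zip(rewards, next_v, dones)
    ]


def dqv_max_targets(rewards, next_q, next_v, dones, gamma=0.99):
    """Separate targets for the V and Q networks (DQV-Max).

    next_q holds one list of Q-values per transition (one entry per action);
    next_v holds one V estimate per transition.
    """
    v_targets = [
        r + (0.0 if done else gamma * max(q))
        for r, q, done in zip(rewards, next_q, dones)
    ]
    q_targets = [
        r + (0.0 if done else gamma * v)
        for r, v, done in zip(rewards, next_v, dones)
    ]
    return v_targets, q_targets
```

In a full agent, each network would be trained by minimising the squared difference between its prediction and the corresponding target over minibatches of stored transitions. The bootstrapped value on a terminal transition is dropped, which is the usual convention rather than something specified here.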
Disciplines :
Computer science
Author, co-author :
Sabatelli, Matthia; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Algorith. des syst. en interaction avec le monde physique
Louppe, Gilles; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Big Data
Geurts, Pierre; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Algorith. des syst. en interaction avec le monde physique
Wiering, Marco
Language :
English
Publication date :
July 2020
Available on ORBi :
since 29 July 2020
