Keywords :
model-free deep reinforcement learning; temporal-difference learning; DQV; DQV-Max-Learning
Abstract :
[en] We present a novel approach for learning an approximation of the optimal state-action value function (Q) in model-free Deep Reinforcement Learning (DRL). We propose to learn this approximation while simultaneously learning an approximation of the state-value function (V). We introduce two new DRL algorithms, called DQV-Learning and DQV-Max-Learning, which follow this specific learning dynamic. In short, both algorithms use two neural networks to separately learn the V function and the Q function. We validate the effectiveness of this training scheme by thoroughly comparing our algorithms to DRL methods which only learn an approximation of the Q function, namely DQN and DDQN. Our results show that DQV and DQV-Max offer several important benefits: they converge significantly faster, can achieve super-human performance on DRL testbeds on which DQN and DDQN failed to do so, and suffer less from the overestimation bias of the Q function.
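The core idea in the abstract, learning V and Q jointly so that both estimators bootstrap from the state-value estimate, can be illustrated with a minimal tabular sketch. Everything below (the toy two-state MDP, the learning rate, the discount factor, and the function names) is an illustrative assumption for exposition, not code from the paper; the actual algorithms use two deep neural networks rather than tables.

```python
import numpy as np

# Hedged sketch: a tabular analogue of the DQV learning dynamic described in
# the abstract. Two separate estimators are kept for V and Q; both regress
# toward the same temporal-difference target r + gamma * V(s'), so the Q
# estimate bootstraps from the V estimate instead of from max_a Q(s', a).
# The MDP, alpha, and gamma below are illustrative assumptions.

n_states, n_actions = 2, 2
V = np.zeros(n_states)               # state-value estimates
Q = np.zeros((n_states, n_actions))  # state-action value estimates
alpha, gamma = 0.1, 0.9

def dqv_update(s, a, r, s_next, done):
    """One DQV-style step: both V and Q bootstrap from the V estimate."""
    target = r + (0.0 if done else gamma * V[s_next])
    V[s] += alpha * (target - V[s])        # TD update for the state-value
    Q[s, a] += alpha * (target - Q[s, a])  # Q regresses toward the same target

# A few synthetic transitions on the toy MDP: state 1 yields reward 1,
# state 0 yields reward 0, and the next state is drawn uniformly.
rng = np.random.default_rng(0)
for _ in range(500):
    s = rng.integers(n_states)
    a = rng.integers(n_actions)
    dqv_update(s, a, float(s == 1), rng.integers(n_states), done=False)
```

Because the target never involves a max over Q, this update avoids the maximization step that drives the overestimation bias of Q-learning, which is consistent with the benefit claimed in the abstract; DQV-Max-Learning instead mixes in a max-based target when updating V.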
Disciplines :
Computer science
Author, co-author :
Sabatelli, Matthia ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Louppe, Gilles ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Big Data
Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique
Wiering, Marco
Language :
English
Title :
The Deep Quality-Value Family of Deep Reinforcement Learning Algorithms
Publication date :
July 2020
Journal title :
International Joint Conference on Neural Networks (IJCNN 2020)