Deep Quality Value (DQV) Learning

We introduce a novel Deep Reinforcement Learning (DRL) algorithm called Deep Quality-Value (DQV) Learning. DQV uses temporal-difference learning to train a Value neural network and uses this network for training a second Quality-value network that learns to estimate state-action values. We first test DQV’s update rules with Multilayer Perceptrons as function approximators on two classic RL problems, and then extend DQV with the use of Deep Convolutional Neural Networks, ‘Experience Replay’ and ‘Target Neural Networks’ for tackling four games of the Atari Arcade Learning environment. Our results show that DQV learns significantly faster and better than Deep Q-Learning and Double Deep Q-Learning, suggesting that our algorithm can potentially be a better performing synchronous temporal difference algorithm than what is currently present in DRL.
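The two coupled updates described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: it assumes lookup tables in place of neural networks and omits experience replay and target networks. The function name `dqv_update`, the learning rate `alpha`, and the discount factor `gamma` are hypothetical names chosen for illustration. The point it shows is that both the state-value estimate and the state-action estimate are regressed toward the same temporal-difference target, which bootstraps from the value estimate of the next state.

```python
from collections import defaultdict


def dqv_update(V, Q, s, a, r, s_next, done, alpha=0.1, gamma=0.99):
    """Apply one tabular DQV-style update in place; return the shared TD target.

    Illustrative assumption: V and Q are dict-like tables, not networks.
    """
    # Both estimates bootstrap from the state-value estimate of the next state.
    target = r if done else r + gamma * V[s_next]
    # Temporal-difference update of the state-value estimate.
    V[s] += alpha * (target - V[s])
    # The state-action estimate is trained toward the same V-based target.
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    return target


# Usage on a single hypothetical transition (state 0, action 1, reward 1.0).
V = defaultdict(float)
Q = defaultdict(float)
target = dqv_update(V, Q, s=0, a=1, r=1.0, s_next=1, done=False)
```

In this sketch the state-action table never bootstraps from itself, which is the structural difference from a Q-Learning-style update that takes a maximum over next-state action values.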

Disciplines :

Computer science

Author, co-author :

Sabatelli, Matthia ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique

Louppe, Gilles ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Big Data

Geurts, Pierre ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique

Wiering, Marco

Language :

English

Title :

Deep Quality Value (DQV) Learning

Publication date :

07 December 2018

Journal title :

Advances in Neural Information Processing Systems

ISSN :

1049-5258

Publisher :

Morgan Kaufmann Publishers, San Mateo, United States - California

Peer reviewed :

Peer Reviewed verified by ORBi

Available on ORBi :

since 15 December 2018

Statistics

Number of views

157 (24 by ULiège)

Number of downloads

112 (12 by ULiège)

