Article (Scientific journals)
Risk-Sensitive Policy with Distributional Reinforcement Learning
Théate, Thibaut; Ernst, Damien
2023 In Algorithms, 16 (325), p. 16
Peer Reviewed verified by ORBi
 

Files


Full Text
algorithms-16-00325.pdf
Author postprint (2.38 MB)

Details



Keywords :
Distributional reinforcement learning; Sequential decision-making; Risk-sensitive policy; Risk management; Deep neural networks
Abstract :
Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. Nevertheless, this approach does not take into consideration the potential risk associated with the actions taken, which may be critical in certain applications. To address that issue, the present research work introduces a novel methodology based on distributional RL to derive sequential decision-making policies that are sensitive to the risk, the latter being modelled by the tail of the return probability distribution. The core idea is to replace the Q function generally standing at the core of learning schemes in RL by another function, taking into account both the expected return and the risk. Named the risk-based utility function U, it can be extracted from the random return distribution Z naturally learnt by any distributional RL algorithm. This enables the spanning of the complete potential trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a truly practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, with an emphasis on the interpretability of the resulting decision-making process.
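As an illustration only, the idea of replacing Q with a utility U that mixes expected return and tail risk can be sketched as follows. The abstract does not give the exact form of U, so this sketch assumes a convex combination weighted by a hypothetical parameter `alpha`, and it assumes the risk term is the mean of the worst `rho` fraction of sampled returns (a CVaR-like quantity). The function names and both parameters are illustrative and are not taken from the article.

```python
import numpy as np


def risk_based_utility(samples, alpha=0.5, rho=0.1):
    """Illustrative utility per action from samples of the return distribution Z.

    samples: array-like of shape (n_actions, n_samples), one row per action.
    alpha:   assumed weight on the expected return (1.0 ignores the risk term).
    rho:     assumed fraction of worst outcomes used as the tail.
    """
    z = np.sort(np.asarray(samples, dtype=float), axis=1)  # ascending per action
    expected = z.mean(axis=1)                               # estimate of E[Z]
    k = max(1, int(np.ceil(rho * z.shape[1])))              # size of the lower tail
    tail = z[:, :k].mean(axis=1)                            # mean of the worst outcomes
    return alpha * expected + (1.0 - alpha) * tail


def greedy_action(samples, alpha=0.5, rho=0.1):
    """Pick the action maximising the illustrative utility instead of Q."""
    return int(np.argmax(risk_based_utility(samples, alpha, rho)))
```

Under these assumptions, setting `alpha` to 1.0 recovers the usual expectation-maximising choice, while lower values give more weight to the tail, which is one way to picture the trade-off between risk minimisation and expected return maximisation described above.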
Disciplines :
Computer science
Author, co-author :
Théate, Thibaut; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Ernst, Damien; Université de Liège - ULiège > Montefiore Institute of Electrical Engineering and Computer Science
Language :
English
Title :
Risk-Sensitive Policy with Distributional Reinforcement Learning
Publication date :
30 June 2023
Journal title :
Algorithms
ISSN :
1999-4893
Publisher :
MDPI Open Access Publishing, Switzerland
Volume :
16
Issue :
325
Pages :
16
Peer reviewed :
Peer Reviewed verified by ORBi
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]
Available on ORBi :
since 30 December 2022

Statistics


Number of views
173 (16 by ULiège)
Number of downloads
72 (5 by ULiège)

Scopus citations®
2
Scopus citations® without self-citations
2
OpenCitations
0
