reinforcement learning; large language models; RL; LLMs
Abstract :
[en] This material is part of the 2025 course INFO8003-1 Reinforcement Learning. It briefly reviews key concepts of Large Language Models (LLMs) and reinforcement learning, then explains how reinforcement learning can be applied to improve LLMs. In particular, it covers the use of reward models and direct preference optimisation in conjunction with preference data.
Research Center/Unit :
Montefiore Institute - Montefiore Institute of Electrical Engineering and Computer Science - ULiège
Disciplines :
Computer science
Author, co-author :
Pirenne, Lize ; Université de Liège - ULiège > Département d'électricité, électronique et informatique (Institut Montefiore) > Smart grids
Language :
English
Title :
Reinforcement Learning and Large Language Models
Publication date :
2025
Number of pages :
47
Course title or code :
INFO8003-1 Reinforcement Learning
Institution :
ULiège - Université de Liège [School of Engineering], Liège, Belgium