[en] We consider Markov reward processes and study OLS-LSTD, a framework for selecting basis functions from a set of candidates to obtain a sparse representation of the value function in the context of least squares temporal difference learning. To support efficient both updating and downdating operations, OLS-LSTD uses a biorthogonal representation for the selected basis vectors. Empirical comparisons with the recently proposed MP and LARS frameworks for LSTD are made.
Disciplines :
Computer science
Author, co-author :
Jung, Tobias ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques
Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
Language :
English
Title :
Biorthogonalization Techniques for Least Squares Temporal Difference Learning