Abstract:
Reinforcement learning (RL) is a widely used learning paradigm for adaptive agents. Because exact RL can only be applied to very simple problems, approximate algorithms are usually necessary in practice. Many algorithms for approximate RL rely on basis-function representations of the value function (or of the Q-function). Designing a good set of basis functions without any prior knowledge of the value function (or of the Q-function) can be a difficult task. In this paper, we instead propose a technique to optimize the shapes of a fixed number of basis functions for the approximate fuzzy Q-iteration algorithm. In contrast to other approaches for adapting basis functions in RL, our optimization criterion measures the actual performance of the computed policies in the task, using simulation from a representative set of initial states. A complete algorithm, using cross-entropy optimization of triangular fuzzy membership functions, is given and applied to the car-on-the-hill example.
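The sketch below illustrates the general idea of cross-entropy (CE) optimization over the centers of triangular membership functions; it is a minimal illustration, not the authors' implementation. The function `evaluate_policy_return` is a hypothetical placeholder: in the paper's setting, the score of a candidate set of membership functions would be obtained by running fuzzy Q-iteration with those functions and then simulating the resulting greedy policy from a representative set of initial states.

```python
# Minimal sketch of cross-entropy optimization of membership-function centers.
# Assumptions: the CE sampling distribution is an independent Gaussian per
# center, and the score function below is a dummy stand-in for the real
# simulation-based performance criterion described in the abstract.
import numpy as np


def evaluate_policy_return(centers):
    """Hypothetical placeholder score.

    Replace with: run fuzzy Q-iteration using triangular membership functions
    centered at `centers`, then return the average discounted return of the
    greedy policy, simulated from a representative set of initial states.
    """
    # Dummy smooth objective so the sketch runs end-to-end.
    target = np.linspace(-1.0, 1.0, len(centers))
    return -np.sum((np.sort(centers) - target) ** 2)


def cross_entropy_optimize(n_centers=5, n_samples=50, n_elite=10,
                           n_iterations=30, seed=0):
    rng = np.random.default_rng(seed)
    mean = np.zeros(n_centers)   # mean of the CE sampling distribution
    std = np.ones(n_centers)     # standard deviation per center

    for _ in range(n_iterations):
        # Sample candidate center vectors and score each one.
        samples = rng.normal(mean, std, size=(n_samples, n_centers))
        scores = np.array([evaluate_policy_return(s) for s in samples])

        # Keep the elite (best-scoring) candidates and refit the distribution.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6

    return np.sort(mean)  # optimized membership-function centers


if __name__ == "__main__":
    print("optimized centers:", cross_entropy_optimize())
```

In this kind of scheme, the number of membership functions stays fixed and only their placement is optimized, which matches the abstract's emphasis on optimizing the shape of a constant number of basis functions rather than growing the basis.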