Reinforcement learning (RL) is a widely used learning paradigm for adaptive agents. Several convergent and consistent RL algorithms have been intensively studied. In their original form, these algorithms require the environment states and agent actions to take values in a relatively small discrete set. Fuzzy representations for approximate, model-free RL have been proposed in the literature for the more difficult case where the state-action space is continuous. In this work, we propose a fuzzy approximation architecture similar to those previously used for Q-learning, but combine it with the model-based Q-value iteration algorithm. We prove that the resulting algorithm converges. We also give a modified, asynchronous variant of the algorithm that converges at least as fast as the original version. An illustrative simulation example is provided.
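The core idea — model-based Q-value iteration over a fuzzy partition of a continuous state space with a discrete action set — can be sketched as follows. This is a minimal illustration, not the paper's algorithm or example: the 1-D dynamics, reward, triangular partition, and all numerical values below are assumptions chosen for clarity.

```python
import numpy as np

# Hypothetical 1-D example: state x in [0, 1], two actions (step left/right).
cores = np.linspace(0.0, 1.0, 11)      # cores x_i of the fuzzy membership functions
actions = np.array([-0.1, 0.1])        # discrete action set u_j
gamma = 0.9                            # discount factor

def memberships(x):
    """Triangular membership degrees phi_i(x); normalized to sum to 1."""
    phi = np.maximum(0.0, 1.0 - np.abs(x - cores) / 0.1)
    return phi / phi.sum()

def f(x, u):
    """Known (model-based) deterministic dynamics."""
    return np.clip(x + u, 0.0, 1.0)

def rho(x, u):
    """Illustrative reward: drive the state toward the goal x = 1."""
    return -abs(1.0 - f(x, u))

theta = np.zeros((len(cores), len(actions)))   # one parameter per (core, action)

def q(x, j, th):
    """Approximate Q-value: membership-weighted sum of parameters."""
    return memberships(x) @ th[:, j]

# Synchronous fuzzy Q-iteration: the update is a gamma-contraction,
# so the parameter matrix converges to a unique fixed point.
for _ in range(200):
    new_theta = np.empty_like(theta)
    for i, xi in enumerate(cores):
        for j, uj in enumerate(actions):
            x_next = f(xi, uj)
            new_theta[i, j] = rho(xi, uj) + gamma * max(
                q(x_next, jp, theta) for jp in range(len(actions)))
    theta = new_theta

# Greedy policy at the cores: here it should step right, toward x = 1.
policy = actions[np.argmax(theta, axis=1)]
```

An asynchronous variant, as in the paper's modified algorithm, would update `theta[i, j]` in place inside the double loop (reusing freshly updated entries immediately) rather than building `new_theta` separately.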
Disciplines:
Computer science
Author, co-author:
Busoniu, Lucian
Ernst, Damien ; Université de Liège - ULiège > Dept. of Electrical Engineering and Computer Science (Montefiore Institute) > Systems and Modeling
De Schutter, Bart
Babuska, Robert
Language:
English
Title:
Continuous-state reinforcement learning with fuzzy approximation
Publication date:
2008
Audience:
International
Main work title:
Adaptive Agents and Multi-Agent Systems III, Adaptation and Multi-Agent Learning
Editors:
Tuyls, K.
Nowé, A.
Guessoum, Z.
Kudenko, D.
ISBN/EAN:
978-3-540-77947-6
Collection name:
Lecture Notes in Artificial Intelligence, Vol. 4865
Pages:
27-43
Peer reviewed:
Peer reviewed
Funders:
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]