Capsule networks; Deep Q-learning; Deep reinforcement learning; Capsule network; Competitive performance; Convolutional neural network; Learning settings; Model free; Network-based architectures; Objects recognition; Q-learning; Translation invariants; Computer Science (all); Mathematics (all)
Abstract :
[en] In recent years, Capsule Networks (CapsNets) have achieved promising results in tasks such as object recognition thanks to their invariance to pose and lighting. They have been proposed as an alternative to relationally insensitive and translation-invariant Convolutional Neural Networks (CNNs). It has been shown empirically that CapsNets can achieve competitive performance while requiring significantly fewer parameters. This is a desirable characteristic for deep reinforcement learning, which is known to be sample-inefficient during training. In this paper, we propose DCapsQN, a task-independent CapsNet-based architecture for the deep reinforcement learning setting. We experiment in the model-free reinforcement learning setting, specifically Deep Q-Learning, using the Atari suite as the testbed for our analysis. To the best of our knowledge, this work constitutes the first CapsNet-based deep reinforcement learning architecture to learn state-action value functions without the need for task-specific adaptation. Our results show that, in this setting, DCapsQN requires 92% fewer parameters than the baseline. Moreover, despite its smaller size, DCapsQN provides significant boosts in performance (score), ranging from 10% to 77%, while further stabilising Deep Q-Learning. This is supported by our empirical results, which show that DCapsQN agents outperform the benchmark Double-DQN agent with prioritized experience replay in eight of the nine selected environments.
Research Center/Unit :
IDLab, University of Antwerpen
Disciplines :
Computer science
Author, co-author :
Singh, Akash ; Université de Liège - ULiège > HEC Liège : UER > UER Opérations : Systèmes d'information de gestion ; imec IDLab, University of Antwerpen, Antwerpen, Belgium
De Schepper, Tom; imec IDLab, University of Antwerpen, Antwerpen, Belgium
Mets, Kevin; imec IDLab, University of Antwerpen, Antwerpen, Belgium
Hellinckx, Peter; imec IDLab, University of Antwerpen, Antwerpen, Belgium
Oramas, José; imec IDLab, University of Antwerpen, Antwerpen, Belgium
Latré, Steven; imec IDLab, University of Antwerpen, Antwerpen, Belgium
Language :
English
Title :
Task Independent Capsule-Based Agents for Deep Q-Learning
Publication date :
2022
Event name :
BENELEARN
Event date :
10 to 12 November 2021
By request :
Yes
Audience :
International
Main work title :
Artificial Intelligence and Machine Learning
Main work alternative title :
[en] Communications in Computer and Information Science
Acknowledgement. This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme.
Afshar, P., Heidarian, S., Naderkhani, F., Oikonomou, A., Plataniotis, K.N., Mohammadi, A.: COVID-CAPS: a capsule network-based framework for identification of COVID-19 cases from X-ray images. Pattern Recogn. Lett. 138, 638–643 (2020)
Afshar, P., Plataniotis, K.N., Mohammadi, A.: Capsule networks for brain tumor classification based on MRI images and coarse tumor boundaries. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1368–1372, November 2019
Allioui, H., Sadgal, M., Elfazziki, A.: Deep MRI segmentation: a convolutional method applied to Alzheimer disease detection. Int. J. Adv. Comput. Sci. Appl. 10(11) (2019). https://doi.org/10.14569/IJACSA.2019.0101151
Andersen, P.A.: Deep reinforcement learning using capsules in advanced game environments. arXiv:1801.09597 [cs, stat], January 2018
Bahadori, M.T.: Spectral capsule networks, p. 5 (2018). https://openreview.net/forum?id=HJuMvYPaM
Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. J. Artif. Intell. Res. 47, 253–279 (2013). https://doi.org/10.1613/jair.3912
Eck, D.J.: Introduction to Computer Graphics (2016)
Gou, S.Z., Liu, Y.: DQN with model-based exploration: efficient learning on environments with sparse rewards. arXiv:1903.09295 [cs, stat], March 2019
Hinton, G., Sabour, S., Frosst, N.: Matrix capsules with EM routing. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=HJWLfGWRb
Hubel, D.H., Wiesel, T.N.: Shape and arrangement of columns in cat’s striate cortex. J. Physiol. 165(3), 559–568 (1963). https://doi.org/10.1113/jphysiol.1963.sp007079
Kalchbrenner, N., Grefenstette, E., Blunsom, P.: A convolutional neural network for modelling sentences. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 655–665. Association for Computational Linguistics, Baltimore (2014). https://doi.org/10.3115/v1/P14-1062
Kempka, M., Wydmuch, M., Runc, G., Toczek, J., Jaśkowski, W.: ViZDoom: a doom-based AI research platform for visual reinforcement learning. In: 2016 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1–8. IEEE (2016)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017). https://doi.org/10.1145/3065386
LaLonde, R., Bagci, U.: Capsules for object segmentation. arXiv:1804.04241 [cs, stat], April 2018
Martínez-Plumed, F., Hernandez-Orallo, J.: AI results for the Atari 2600 games: difficulty and discrimination using IRT. In: Evaluating General-Purpose AI, p. 6 (2017)
Mnih, V., et al.: Playing Atari with deep reinforcement learning. arXiv:1312.5602 [cs], December 2013
Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015). https://doi.org/10.1038/nature14236
Molnar, T., Culurciello, E.: Capsule network performance with autonomous navigation. Int. J. Artif. Intell. Appl. 11(1), 1–15 (2020). https://doi.org/10.5121/ijaia.2020.11101
Pan, C., Velipasalar, S.: PT-CapsNet: a novel prediction-tuning capsule network suitable for deeper architectures. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 11996–12005 (2021)
Pelli, D.G.: Crowding: a cortical constraint on object recognition. Curr. Opin. Neurobiol. 18(4), 445–451 (2008). https://doi.org/10.1016/j.conb.2008.09.008
Phaye, S.S.R., Sikka, A., Dhall, A., Bathula, D.: Dense and diverse capsule networks: making the capsules learn better. arXiv:1805.04001 [cs], May 2018
Rawlinson, D., Ahmed, A., Kowadlo, G.: Sparse unsupervised capsules generalize better. arXiv:1804.06094 [cs], April 2018
Sabour, S., Frosst, N., Hinton, G.E.: Dynamic routing between capsules. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 3856–3866. Curran Associates, Inc. (2017). http://papers.nips.cc/paper/6975-dynamic-routing-between-capsules.pdf
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, 2–4 May 2016, Conference Track Proceedings (2016). http://arxiv.org/abs/1511.05952
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. arXiv:1509.06461 [cs], December 2015
Wen, X., Han, Z., Liu, X., Liu, Y.S.: Point2SpatialCapsule: aggregating features and spatial relationships of local regions on point clouds using spatial-aware capsules. IEEE Trans. Image Process. 29, 8855–8869 (2020)
Wu, Y., Ma, S., Zhang, D., Sun, J.: 3D capsule hand pose estimation network based on structural relationship information. Symmetry 12(10) (2020). https://doi.org/10.3390/sym12101636. https://www.mdpi.com/2073-8994/12/10/1636