Task Independent Capsule-Based Agents for Deep Q-Learning

Singh, Akash; De Schepper, Tom; Mets, Kevin; Hellinckx, Peter; Oramas, José; Latré, Steven

doi:10.1007/978-3-030-93842-0_4

No full text

Paper published in a book (Scientific congresses and symposiums)

2022 • In Artificial Intelligence and Machine Learning

Peer reviewed Dataset

Permalink
https://hdl.handle.net/2268/288478

Details

Keywords :

Capsule networks; Deep Q-learning; Deep reinforcement learning; Capsule network; Competitive performance; Convolutional neural network; Learning settings; Model free; Network-based architectures; Objects recognition; Q-learning; Translation invariants; Computer Science (all); Mathematics (all)

Abstract :

[en] In recent years, Capsule Networks (CapsNets) have achieved promising results in tasks such as object recognition thanks to their invariance to pose and lighting. They have been proposed as an alternative to relationally insensitive and translation-invariant Convolutional Neural Networks (CNNs). It has been shown empirically that CapsNets can achieve competitive performance while requiring significantly fewer parameters. This is a desirable characteristic for deep reinforcement learning, which is known to be sample-inefficient during training. In this paper, we propose DCapsQN, a task-independent CapsNets-based architecture for the deep reinforcement learning setting. We experiment in the model-free reinforcement learning setting, more specifically in Deep Q-Learning, using the Atari suite as the testbed for our analysis. To the best of our knowledge, this work constitutes the first CapsNets-based deep reinforcement learning architecture to learn state-action value functions without the need for task-specific adaptation. Our results show that, in this setting, DCapsQN requires 92% fewer parameters than the baseline. Moreover, despite its smaller size, DCapsQN provides significant boosts in performance (score), ranging from 10% to 77%, while further stabilising Deep Q-Learning. This is supported by our empirical results, which show that DCapsQN agents outperform the benchmark Double-DQN agent with prioritized experience replay in eight out of the nine selected environments.
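
For readers unfamiliar with the Double-DQN benchmark named in the abstract, the following is a minimal sketch of the standard Double Q-learning target for a single transition. It is not the authors' code: the function name `double_dqn_target` and its parameters are hypothetical, and the Q-values are passed in as plain lists rather than produced by a CNN or CapsNet. It illustrates only the general idea that the online network selects the next action while the target network evaluates it.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Double Q-learning target for one transition (illustrative sketch).

    q_online_next: Q-values of the next state from the online network.
    q_target_next: Q-values of the next state from the target network.
    Both are assumed to have one entry per action.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network selects the greedy action ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Online network prefers action 1; the target network values it at 4.0.
    print(double_dqn_target(1.0, False, 0.5, [0.1, 0.9, 0.3], [2.0, 4.0, 8.0]))
```

Decoupling action selection from action evaluation is what distinguishes this target from the plain DQN target, which would take the maximum of the target network's own values (8.0 in the example above).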

Research Center/Unit :

IDLab, University of Antwerpen

Disciplines :

Computer science

Author, co-author :

Singh, Akash ; Université de Liège - ULiège > HEC Liège : UER > UER Opérations : Systèmes d'information de gestion ; imec IDLab, University of Antwerpen, Antwerpen, Belgium

De Schepper, Tom; imec IDLab, University of Antwerpen, Antwerpen, Belgium

Mets, Kevin; imec IDLab, University of Antwerpen, Antwerpen, Belgium

Hellinckx, Peter; imec IDLab, University of Antwerpen, Antwerpen, Belgium

Oramas, José; imec IDLab, University of Antwerpen, Antwerpen, Belgium

Latré, Steven; imec IDLab, University of Antwerpen, Antwerpen, Belgium

Language :

English

Title :

Task Independent Capsule-Based Agents for Deep Q-Learning

Publication date :

2022

Event name :

BENELEARN

Event date :

10 to 12 November 2021

By request :

Yes

Audience :

International

Main work title :

Artificial Intelligence and Machine Learning

Main work alternative title :

[en] Communications in Computer and Information Science

Publisher :

Springer Nature, Switzerland

Peer reviewed :

Peer reviewed

Additional URL :

https://gym.openai.com/envs/#classic_control

Funders :

FlandersAI

Funding text :

Acknowledgement. This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme.

Data Set :

Gym Atari

Available on ORBi :

since 28 February 2022
