Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning: A Study Intended for Large-Scale Video Games

Taralla, David; Qiu, Zixiao; Sutera, Antonio; Fonteneau, Raphaël; Ernst, Damien

doi:10.5220/0005666202640271

Paper published in a book (Scientific congresses and symposiums)

2016 • In Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016) - Volume 2

Peer reviewed

Permalink
https://hdl.handle.net/2268/191463

DOI
10.5220/0005666202640271

Details

Keywords :

Artificial Intelligence; Decision Making; Video Games; Hearthstone; Supervised Learning; ExtraTrees

Abstract :

Video games have become increasingly complex over the past decades. Today, players wander in visually rich and option-rich environments, and each choice they make, at any given time, can have a combinatorial number of consequences. However, modern video game artificial intelligence is still usually hard-coded, and as game environments grow more complex, this hard-coding becomes exponentially more difficult. Recent research has started to let autonomous video game agents learn instead of being taught, which makes them more intelligent. This contribution falls within this perspective, as it aims to develop a framework for the generic design of autonomous agents for large-scale video games. We consider a class of games for which expert knowledge is available to define a state quality function that indicates how close an agent is to its objective. The decision-making policy is based on a confidence measurement on the growth of the state quality function, computed by a supervised learning classification model. Additionally, no stratagems aiming to reduce the action space are used. As a proof of concept, we tested this simple approach on the collectible card game Hearthstone and obtained encouraging results.
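
The policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `select_action`, `encode`, `growth_confidence` and `threshold`, and the choice to pass (return no action) when no confidence reaches the threshold, are all assumptions made for illustration. The confidence function stands in for a trained classifier's estimated probability that the state quality function grows after an action; in practice it might wrap something like scikit-learn's `ExtraTreesClassifier.predict_proba`, which is also an assumption here.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple, TypeVar

State = TypeVar("State")
Action = TypeVar("Action")
Features = Sequence[float]


def select_action(
    state: State,
    actions: Iterable[Action],
    encode: Callable[[State, Action], Features],
    growth_confidence: Callable[[Features], float],
    threshold: float = 0.5,
) -> Tuple[Optional[Action], float]:
    """Pick the action whose predicted state-quality growth is most certain.

    Every available action is scored (the action space is not pruned).
    Returns (None, best_confidence) when no action reaches `threshold`;
    that "pass" behaviour is an assumption of this sketch.
    """
    found = False
    best_action: Optional[Action] = None
    best_conf = 0.0
    for action in actions:
        conf = growth_confidence(encode(state, action))
        if not found or conf > best_conf:
            found, best_action, best_conf = True, action, conf
    if not found or best_conf < threshold:
        return None, best_conf
    return best_action, best_conf


# Toy stand-ins (hypothetical): a real agent would encode game state and
# action into features and query a trained classifier instead.
def toy_encode(state: float, action: float) -> Features:
    return (state, action)


def toy_confidence(features: Features) -> float:
    # Pretend larger actions are more likely to improve the state quality.
    _, action = features
    return max(0.0, min(1.0, 0.5 + 0.1 * action))


chosen, confidence = select_action(3.0, [-2.0, 1.0, 4.0], toy_encode, toy_confidence)
print(chosen, confidence)
```

Keeping the classifier behind a plain callable keeps the selection rule independent of any particular learning library, so the same loop works with whatever model produces the confidence values.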

Disciplines :

Computer science

Author, co-author :

Taralla, David ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids

Qiu, Zixiao ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids

Sutera, Antonio ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Algorith. des syst. en interaction avec le monde physique

Fonteneau, Raphaël ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids

Ernst, Damien ; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids

Language :

English

Title :

Decision Making from Confidence Measurement on the Reward Growth using Supervised Learning: A Study Intended for Large-Scale Video Games

Publication date :

February 2016

Event name :

8th International Conference on Agents and Artificial Intelligence

Event organizer :

INSTICC - Institute for Systems and Technologies of Information, Control and Communication

Event place :

Rome, Italy

Event date :

24-26 February 2016

Audience :

International

Main work title :

Proceedings of the 8th International Conference on Agents and Artificial Intelligence (ICAART 2016) - Volume 2

ISBN/EAN :

978-989-758-172-4

Pages :

264-271

Peer reviewed :

Peer reviewed

Funders :

F.R.S.-FNRS - Fonds de la Recherche Scientifique
FRIA - Fonds pour la Formation à la Recherche dans l'Industrie et dans l'Agriculture

Available on ORBi :

since 15 January 2016


