Reference : Online Sparse Bandit for Card Games
Scientific congresses and symposiums : Paper published in a journal
Engineering, computing & technology : Computer science
Online Sparse Bandit for Card Games
Lupien St-Pierre, David mailto [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Dép. d'électric., électron. et informat. (Inst.Montefiore) >]
Louveaux, Quentin mailto [Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Système et modélisation : Optimisation discrète >]
Teytaud, Olivier [> >]
Advance in Computer Games
13th Conference in Advances in Computer Games
September 2011
[en] Finding an approximation of a Nash equilibria in matrix games is an important topic that reaches beyond the strict application to matrix games. A bandit algorithm commonly used to approximate a Nash equilibrium is EXP3. However, the solution to many problems is often sparse, yet EXP3 inherently fails to exploit this property. To the knowledge of the authors, there exist only an offline truncation to tackle such issue. In this paper, we propose a variation of EXP3 to exploit the fact that solution is sparse by dynamically removing arms; the resulting algorithm empirically performs better than previous versions. We apply the resulting algorithm to a MCTS program for the Urban Rivals card game.

File(s) associated to this reference

Fulltext file(s):

Restricted access
acg2011.pdfPublisher postprint164.2 kBRequest copy

Bookmark and Share SFX Query

All documents in ORBi are protected by a user license.