A feature-based approach for best arm identification in the case of the Monte Carlo search algorithm discovery for one-player games

Taralla, David

Other (Reports)

2013

Permalink
https://hdl.handle.net/2268/177924

Files

Full Text

rapport.pdf

Publisher postprint (1.98 MB)

Annexes

presentation.pdf

Publisher postprint (882.15 kB)

The supporting slides of this internship defense.

Details

Keywords :

Monte Carlo tree search; optimisation; MCTS; best arm identification

Abstract :

The field of reinforcement learning recently received a contribution from Ernst et al. (2013), "Monte Carlo search algorithm discovery for one player games", which introduced a new way to conceive entirely new algorithms. It also provided an automatic method, based on a multi-armed bandit approach, for finding the best algorithm to use in a particular situation. We address here the problem of best arm identification. The main difficulty is that the space of generated algorithms (i.e., the arm space) can become quite large as the depth of the generated algorithms increases, so we cannot sample each algorithm often enough to be confident in the final choice (i.e., to be sure that the regret is minimized). We therefore need an optimized, scalable method for selecting the best algorithm from larger spaces. The main idea is to model the reward of pulling an arm as a function of its features, rather than directly exploring the algorithm space to find the best arm. In this way, we demonstrate that we are able to design a confident best arm identification algorithm that does not suffer from the size of the space.
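To make the feature-based idea concrete, here is a minimal sketch in Python. It is not the method of the report, whose details are not given in this record. It assumes, purely for illustration, that each arm comes with a numeric feature vector, that the expected reward is roughly linear in those features, and that arms are sampled uniformly at random. The names `identify_best_arm`, `solve`, `features`, `pull`, `budget` and `ridge` are all hypothetical. The point it illustrates is that one shared model is fitted from far fewer pulls than there are arms, and the best arm is then chosen from the model's predictions.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def identify_best_arm(features, pull, budget, ridge=1.0, seed=0):
    """Pick the arm with the highest predicted reward under a linear model.

    features: list of feature vectors, one per arm (assumed encoding).
    pull:     callable mapping an arm index to one noisy reward sample.
    budget:   total number of pulls, possibly much smaller than the number of arms.
    ridge:    regularisation strength of the least-squares fit.
    """
    rng = random.Random(seed)
    d = len(features[0])
    # Accumulate the ridge-regularised normal equations A theta = b.
    A = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for _ in range(budget):
        arm = rng.randrange(len(features))  # uniform exploration (an assumption)
        x = features[arm]
        reward = pull(arm)
        for i in range(d):
            b[i] += reward * x[i]
            for j in range(d):
                A[i][j] += x[i] * x[j]
    theta = solve(A, b)
    # Score every arm, including arms that were never pulled.
    predicted = [sum(t * xi for t, xi in zip(theta, x)) for x in features]
    best = max(range(len(features)), key=predicted.__getitem__)
    return best, theta
```

Under these assumptions, the number of pulls needed depends mainly on the number of features rather than on the number of arms, which is the property the abstract is after. A non-linear reward model or an adaptive sampling rule could be substituted without changing the overall structure.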

Disciplines :

Computer science

Author, co-author :

Taralla, David ; Université de Liège - ULiège > 2nd year, Master in civil engineering in computer science (in-depth focus)

Language :

English

Title :

A feature-based approach for best arm identification in the case of the Monte Carlo search algorithm discovery for one-player games

Publication date :

December 2013

Available on ORBi :

since 03 February 2015

Statistics

Number of views

170 (25 by ULiège)

Number of downloads

194 (6 by ULiège)
