Reports : Other
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/177924
A feature-based approach for best arm identification in the case of the Monte Carlo search algorithm discovery for one-player games
English
Taralla, David [Université de Liège - ULiège > 2e an. master ingé. civ. info., fin. appr.]
Dec-2013
[en] monte carlo tree search ; optimisation ; mcts ; best arm identification
[en] The field of reinforcement learning recently received a contribution from Ernst et al. (2013), "Monte Carlo search algorithm discovery for one player games", which introduced a new way to design entirely new algorithms. It also provided an automatic method, based on a multi-armed bandit approach, for finding the best algorithm to use in a particular situation. We address here the problem of best arm identification. The main difficulty is that the generated algorithm space (i.e., the arm space) can grow very large as the depth of the generated algorithms increases, so we cannot sample each algorithm often enough to be confident in the final choice (i.e., to be sure the regret is minimized). We therefore need an optimized, scalable method for selecting the best algorithm from larger spaces. The main idea is to model the reward of pulling an arm as a function of its features, rather than exploring the algorithm space directly to find the best arm. In this way, we demonstrate that we can design a confident best arm identification algorithm that does not suffer from the size of the space.
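The idea of treating an arm's reward as a function of its features can be illustrated with a toy sketch. The abstract does not say which model the report uses, so everything below is an assumption made for illustration: arms are binary feature vectors, the expected reward is linear in those features, the weights are estimated by ridge regression, and arms are pulled uniformly at random. The names `TRUE_W`, `pull`, `fit_ridge`, and `identify_best_arm` are hypothetical and do not come from the report.

```python
import itertools
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ridge(X, y, lam=1e-3):
    """Estimate linear reward weights w from (features, reward) samples."""
    d = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(d)]
    return solve(A, b)


def identify_best_arm(arms, pull, budget, rng):
    """Pull `budget` random arms, fit the reward model, return the arm
    with the highest predicted reward (it may never have been pulled)."""
    X = [rng.choice(arms) for _ in range(budget)]
    y = [pull(x) for x in X]
    w = fit_ridge(X, y)
    return max(arms, key=lambda a: sum(wi * ai for wi, ai in zip(w, a)))


# Assumed toy problem: 8 binary features give 2**8 = 256 arms.
TRUE_W = [1.0, -0.5, 2.0, 0.25, -1.5, 0.75, -0.25, 0.5]
arms = list(itertools.product((0, 1), repeat=len(TRUE_W)))
rng = random.Random(0)


def pull(arm):
    """Noisy reward: linear in the arm's features plus Gaussian noise."""
    mean = sum(w * a for w, a in zip(TRUE_W, arm))
    return mean + rng.gauss(0.0, 0.1)


# 100 pulls in total, fewer than one pull per arm.
best = identify_best_arm(arms, pull, budget=100, rng=rng)
print(best)
```

In this sketch the sampling budget is smaller than the number of arms, yet the model only has to estimate one weight per feature, so the cost of identification scales with the number of features rather than with the size of the arm space. A real algorithm space would need features that actually predict reward, and a non-linear model may be required.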
Researchers

File(s) associated to this reference

Fulltext file(s):

File          Commentary  Version              Size     Access
rapport.pdf               Publisher postprint  1.93 MB  Open access

Additional material(s):

File              Commentary                                         Size       Access
presentation.pdf  The supporting slides of this internship defense.  861.48 kB  Open access

All documents in ORBi are protected by a user license.