Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces

Defourny, Boris; Ernst, Damien; Wehenkel, Louis

doi:10.1109/ADPRL.2009.4927538


Paper published in a book (Scientific congresses and symposiums)

2009 • In Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)

Peer reviewed

Permalink
https://hdl.handle.net/2268/14318

DOI
10.1109/ADPRL.2009.4927538


Files

Full Text

adprl09_PuuEdtKdas.pdf

Publisher postprint (199.74 kB)


Details

Keywords :

planning under uncertainty; reinforcement learning; multi-stage stochastic programming; disturbance trees

Abstract :

Optimizing decisions on an ensemble of incomplete disturbance trees and aggregating their first-stage decisions has been shown to be a promising approach to (model-based) planning under uncertainty, both in large continuous action spaces and in small discrete ones. The present paper extends this approach to large but highly structured discrete action spaces by means of a kernel-based aggregation scheme. The technique is applied to a test problem, adapted from the NIPS 2005 SensorNetwork benchmark, with a discrete action space of 6561 elements.
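The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's scheme: the kernel, the function names `kernel` and `aggregate`, and the `bandwidth` parameter are all hypothetical, and the factorization of the action space into 8 components with 3 choices each is an assumption based only on the arithmetic 3^8 = 6561. Each tree in the ensemble is taken to have already produced one first-stage decision, and the aggregate is the action with the largest summed kernel similarity to those decisions.

```python
import math
from itertools import product

# Assumption: the 6561 actions factor as 8 components with 3 choices each,
# since 3**8 == 6561. The abstract does not state this factorization.
N_COMPONENTS = 8
N_CHOICES = 3


def kernel(a, b, bandwidth=1.0):
    """Hypothetical similarity kernel: exp(-Hamming distance / bandwidth)."""
    hamming = sum(x != y for x, y in zip(a, b))
    return math.exp(-hamming / bandwidth)


def aggregate(decisions, bandwidth=1.0):
    """Return the action with the largest summed kernel similarity
    to the first-stage decisions proposed by the ensemble."""
    actions = product(range(N_CHOICES), repeat=N_COMPONENTS)
    return max(
        actions,
        key=lambda a: sum(kernel(a, d, bandwidth) for d in decisions),
    )


# Toy ensemble: two trees agree on one action, a third proposes another.
ensemble = [(0,) * 8, (0,) * 8, (2,) * 8]
chosen = aggregate(ensemble)
```

Brute-force enumeration is affordable here because the space has only 6561 elements; a larger structured space would need a search that exploits the kernel's structure instead.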

Disciplines :

Computer science

Author, co-author :

Defourny, Boris ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation

Language :

English

Title :

Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces

Publication date :

2009

Event name :

IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)

Event place :

Nashville, United States

Event date :

March 30 - April 2, 2009

Audience :

International

Main work title :

Proceedings of the IEEE International Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL-09)

ISBN/EAN :

978-1-4244-2761-1

Pages :

145-152

Peer review/Selection committee :

Peer reviewed

Funders :

F.R.S.-FNRS - Fonds de la Recherche Scientifique

Available on ORBi :

since 11 June 2009

Statistics

Number of views

234 (11 by ULiège)

Number of downloads

290 (7 by ULiège)

