In the context of multistage stochastic optimization problems, we propose a hybrid strategy that uses machine learning to generalize, to nonlinear decision rules, a finite data set of constrained vector-valued recourse decisions optimized by scenario-tree techniques from multistage stochastic programming. The decision rules are based on a statistical model inferred from a given scenario-tree solution and are selected by out-of-sample simulation on the true problem. Because the learned rules depend on the given scenario tree, we repeat the procedure for a large number of randomly generated scenario trees and then select the best solution (policy) found for the true problem. The scheme thus leads to an ex post selection of the scenario tree itself. Numerical tests evaluate the dependence of the approach on the machine learning aspects and show cases where one can obtain near-optimal solutions, starting with a “weak” scenario-tree generator that randomizes the branching structure of the trees.
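The selection scheme outlined above can be illustrated on a toy one-dimensional problem (a minimal sketch: all function names and the quadratic recourse cost below are hypothetical stand-ins, not the paper's formulation):

```python
import random

def scenario_tree_policy_search(sample_scenario, n_trees=20, tree_size=5,
                                n_eval=200, seed=0):
    """Toy sketch of the selection scheme: repeat over randomly generated
    scenario sets, derive a decision rule from the decisions optimized on
    each set, and keep the rule that scores best in out-of-sample
    simulation on the true problem."""
    rng = random.Random(seed)
    # Out-of-sample scenarios standing in for simulation on the true problem.
    eval_scenarios = [sample_scenario(rng) for _ in range(n_eval)]
    best_rule, best_score = None, float("inf")
    for _ in range(n_trees):
        # A randomly generated "scenario tree" (here just a small scenario set).
        tree = [sample_scenario(rng) for _ in range(tree_size)]
        # For the toy cost (u - w)^2, the per-scenario optimal recourse is
        # u = w; the "learned" decision rule is the constant fitted to
        # those optimized decisions.
        theta = sum(tree) / len(tree)
        score = sum((theta - w) ** 2 for w in eval_scenarios) / n_eval
        if score < best_score:
            best_rule, best_score = theta, score
    return best_rule, best_score

# Uncertainty w ~ N(0, 1); the search returns the ex post best rule found.
rule, score = scenario_tree_policy_search(lambda rng: rng.gauss(0.0, 1.0))
```

The ex post selection of the scenario tree appears here as the `if score < best_score` step: the tree is judged only through the out-of-sample performance of the policy learned from it.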
Disciplines :
Computer science
Author, co-author :
Defourny, Boris; Princeton University > Department of Operations Research and Financial Engineering
Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Scenario Trees and Policy Selection for Multistage Stochastic Programming Using Machine Learning
References :
Abbeel P, Ng A (2004) Apprenticeship learning via inverse reinforcement learning. Proc. 21st Internat. Conf. Machine Learn. (ICML 2004) (ACM, New York), 1-8.
Birge JR (1997) State-of-the-art survey - Stochastic programming: Computation and applications. INFORMS J. Comput. 9(2):111-133.
Birge JR, Louveaux F (1997) Introduction to Stochastic Programming (Springer, New York).
Busoniu L, Babuska R, De Schutter B, Ernst D (2010) Reinforcement Learning and Dynamic Programming Using Function Approximators (CRC Press, Boca Raton, FL).
Chiralaksanakul A (2003) Monte Carlo methods for multi-stage stochastic programs. Ph.D. thesis, University of Texas at Austin, Austin.
Coates A, Abbeel P, Ng A (2008) Learning for control from multiple demonstrations. Proc. 25th Internat. Conf. Machine Learn. (ICML 2008) (ACM, New York), 144-151.
Defourny B (2010) Machine learning solution methods for multistage stochastic programming. Ph.D. thesis, University of Liège, Liège, Belgium.
Defourny B, Ernst D, Wehenkel L (2009) Bounds for multistage stochastic programs using supervised learning strategies. Proc. 5th Internat. Conf. Stochastic Algorithms: Foundations and Applications (SAGA 2009), Lecture Notes in Computer Science, Vol. 5792 (Springer-Verlag, Berlin), 61-73.
Defourny B, Ernst D, Wehenkel L (2012) Multistage stochastic programming: A scenario tree based approach to planning under uncertainty. Morales EF, Sucar LE, Hoey J, eds. Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions (IGI Global, Hershey, PA), 97-143.
Dupacova J, Consigli G, Wallace SW (2000) Scenarios for multistage stochastic programs. Ann. Oper. Res. 100(1-4):25-53.
Frauendorfer K (1996) Barycentric scenario trees in convex multistage stochastic programming. Math. Programming 75(2):277-294.
Grant M, Boyd S (2008) Graph implementations for nonsmooth convex programs. Blondel V, Boyd S, Kimura H, eds. Recent Advances in Learning and Control - A Tribute to M. Vidyasagar, Lecture Notes in Control and Information Systems, Vol. 371 (Springer, New York), 95-110.
Grant M, Boyd S (2009) CVX: Matlab software for disciplined convex programming (Web page and software). Accessed February 28, 2009 http://cvxr.com/cvx/.
Hastie T, Tibshirani R, Friedman J (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd ed. (Springer, New York).
Heitsch H, Römisch W (2009) Scenario tree modeling for multistage stochastic programs. Math. Programming 118(2):371-406.
Heitsch H, Römisch W (2011) Stability and scenario trees for multistage stochastic programs. Infanger G, ed. Stochastic Programming - The State of the Art, In Honor of George B. Dantzig (Springer, New York), 139-164.
Hilli P, Pennanen T (2008) Numerical study of discretizations of multistage stochastic programs. Kybernetika 44(2):185-204.
Høyland K, Wallace SW (2001) Generating scenario trees for multistage decision problems. Management Sci. 47(2):295-307.
Huang K, Ahmed S (2009) The value of multistage stochastic programming in capacity planning under uncertainty. Oper. Res. 57(4):893-904.
Kallrath J, Pardalos PM, Rebennack S, Scheidt M, eds. (2009) Optimization in the Energy Industry (Springer-Verlag, Berlin).
Kouwenberg R (2001) Scenario generation and stochastic programming models for asset liability management. Eur. J. Oper. Res. 134(2):279-292.
Küchler C, Vigerske S (2010) Numerical evaluation of approximation methods in stochastic programming. Optimization 59(3):401-415.
Mak W-K, Morton DP, Wood RK (1999) Monte Carlo bounding techniques for determining solution quality in stochastic programs. Oper. Res. Lett. 24(1-2):47-56.
Mulvey JM, Kim WC (2011) Multistage financial planning models: Integrating stochastic programs and policy simulators. Infanger G, ed. Stochastic Programming - The State of the Art, In Honor of George B. Dantzig (Springer, New York), 257-276.
Nesterov Y, Vial J-P (2008) Confidence level solutions for stochastic programming. Automatica 44(6):1559-1568.
O'Hagan A (1978) Curve fitting and optimal design for prediction. J. Roy. Statist. Soc. 40(1):1-42.
Pagès G, Printems J (2003) Optimal quadratic quantization for numerics: The Gaussian case. Monte Carlo Methods Appl. 9(2):135-166.
Pennanen T (2005) Epi-convergent discretizations of multistage stochastic programs. Math. Oper. Res. 30(1):245-256.
Pennanen T (2009) Epi-convergent discretizations of multistage stochastic programs via integration quadratures. Math. Programming 116(1):461-479.
Peters J, Schaal S (2008) Natural actor-critic. Neurocomputing 71(7-9):1180-1190.
Powell WB (2011) Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd ed. (Wiley, Hoboken, NJ).
Rasmussen CE, Williams CKI (2006) Gaussian Processes for Machine Learning (MIT Press, Cambridge, MA).
Rockafellar RT, Wets RJ-B (1991) Scenarios and policy aggregation in optimization under uncertainty. Math. Oper. Res. 16(1):119-147.
Shapiro A (2003) Inference of statistical bounds for multistage stochastic programming problems. Math. Methods Oper. Res. 58(1):57-68.
Shapiro A, Dentcheva D, Ruszczyński A (2009) Lectures on Stochastic Programming: Modeling and Theory (MPS-SIAM Series on Optimization, SIAM, Philadelphia).
Sutton RS, Barto AG (1998) Reinforcement Learning: An Introduction (MIT Press, Cambridge, MA).
Syed U, Bowling M, Schapire RE (2008) Apprenticeship learning using linear programming. Proc. 25th Internat. Conf. Machine Learn. (ICML 2008) (Omni Press, Madison, WI), 1032-1039.
Szepesvári C (2010) Algorithms for Reinforcement Learning (Morgan & Claypool Publishers).
Thénié J, Vial J-P (2008) Step decision rules for multistage stochastic programming: A heuristic approach. Automatica 44(6):1569-1584.
Wallace SW, Ziemba WT, eds. (2005) Applications of Stochastic Programming (MPS-SIAM Series on Optimization, SIAM, Philadelphia).