Multistage stochastic programming: A scenario tree based approach to planning under uncertainty
Defourny, Boris; Ernst, Damien; Wehenkel, Louis
2011 • In Sucar, L. Enrique; Morales, Eduardo F.; Hoey, Jesse (Eds.) Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions
Sequential decision making under uncertainty; scenario tree generation; validation of approximate solutions
Abstract :
In this chapter, we present the multistage stochastic programming framework for sequential decision making under uncertainty. We discuss how it differs from Markov Decision Processes, both in the decision models and in the solution algorithms. We describe the standard technique for approximately solving multistage stochastic problems, which is based on a discretization of the disturbance space called a scenario tree. We highlight a critical issue of the approach: the decisions can be very sensitive to the parameters of the scenario tree, while no efficient tool exists for checking the quality of approximate solutions. We show how supervised learning techniques can be used to reliably evaluate the quality of an approximation, and thus to facilitate the selection of a good scenario tree. The framework and solution techniques presented in the chapter are explained and illustrated in detail on several examples. Along the way, we define notions from decision theory that can be used to quantify, for a particular problem, the advantage of switching to a more sophisticated decision model.
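The scenario-based approximation described in the abstract can be illustrated with a minimal sketch: a two-stage newsvendor problem whose demand distribution is discretized into a finite scenario set. The problem, numbers, and names below are illustrative assumptions, not taken from the chapter.

```python
# Minimal sketch of scenario-based stochastic programming (illustrative
# example, not the chapter's method): a two-stage newsvendor problem.
# First stage: choose an order quantity q before demand is known.
# Second stage: sell min(q, d) units at `price`; unsold units are lost.

cost, price = 1.0, 3.0
# Finite scenario set discretizing the demand distribution: (probability, demand).
scenarios = [(0.2, 5), (0.5, 10), (0.3, 15)]

def expected_profit(q):
    """Expected profit of the first-stage decision q over the scenario set."""
    return sum(p * (price * min(q, d) - cost * q) for p, d in scenarios)

# Solve the discretized problem by enumerating candidate first-stage decisions.
best_q = max(range(16), key=expected_profit)
print(best_q, expected_profit(best_q))  # optimal order and its expected profit
```

Note that the solution depends entirely on the chosen discretization: a different scenario set can shift `best_q`, which is precisely the sensitivity issue the chapter addresses.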
Research Center/Unit :
Systems and Modeling Research Unit
Disciplines :
Computer science
Author, co-author :
Defourny, Boris ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore)
Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Multistage stochastic programming: A scenario tree based approach to planning under uncertainty
Publication date :
2011
Main work title :
Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions
Editor :
Sucar, L. Enrique
Morales, Eduardo F.
Hoey, Jesse
Publisher :
Information Science Publishing, Hershey, PA, United States
ISBN/EAN :
978-1609601652
Peer reviewed :
Peer reviewed
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique
BELSPO - Politique scientifique fédérale
Funding text :
This paper presents research results of the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office. The scientific responsibility rests with its authors. Damien Ernst is a Research Associate of the Belgian FRS-FNRS, of which he acknowledges the financial support. This work was supported in part by the IST Programme of the European Community, under the PASCAL2 Network of Excellence, IST-2007-216886.