Multistage stochastic programming: A scenario tree based approach to planning under uncertainty
Defourny, Boris; Ernst, Damien; Wehenkel, Louis
2011 • In Sucar, L. Enrique; Morales, Eduardo F.; Hoey, Jesse (Eds.) Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions
Sequential decision making under uncertainty; scenario tree generation; validation of approximate solutions
Abstract :
In this chapter, we present the multistage stochastic programming framework for sequential decision making under uncertainty. We discuss how it differs from Markov Decision Processes, both in the decision models and in the solution algorithms. We describe the standard technique for approximately solving multistage stochastic problems, which is based on a discretization of the disturbance space called a scenario tree. We highlight a critical issue of the approach: the decisions can be very sensitive to the parameters of the scenario tree, while no efficient tool exists for checking the quality of approximate solutions. We show how supervised learning techniques can be used to reliably evaluate the quality of an approximation, and thus to facilitate the selection of a good scenario tree. The framework and solution techniques presented in the chapter are explained and illustrated in detail on several examples. Along the way, we define notions from decision theory that can be used to quantify, for a particular problem, the advantage of switching to a more sophisticated decision model.
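The scenario-based approximation described in the abstract can be illustrated with a minimal sketch: a two-stage newsvendor problem whose demand distribution is discretized into a finite scenario set. The problem, numbers, and names below are illustrative assumptions, not taken from the chapter.

```python
# Minimal sketch of scenario-based stochastic programming (illustrative
# example, not the chapter's method): a two-stage newsvendor problem.
# First stage: choose an order quantity q before demand is known.
# Second stage: sell min(q, d) units at `price`; unsold units are lost.

cost, price = 1.0, 3.0
# Finite scenario set discretizing the demand distribution: (probability, demand).
scenarios = [(0.2, 5), (0.5, 10), (0.3, 15)]

def expected_profit(q):
    """Expected profit of the first-stage decision q over the scenario set."""
    return sum(p * (price * min(q, d) - cost * q) for p, d in scenarios)

# Solve the discretized problem by enumerating candidate first-stage decisions.
best_q = max(range(16), key=expected_profit)
print(best_q, expected_profit(best_q))  # optimal order and its expected profit
```

Note that the solution depends entirely on the chosen discretization: a different scenario set can shift `best_q`, which is precisely the sensitivity issue the chapter addresses.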
Research Center/Unit :
Systems and Modeling Research Unit
Disciplines :
Computer science
Author, co-author :
Defourny, Boris ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore)
Ernst, Damien ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Wehenkel, Louis ; Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation
Language :
English
Title :
Multistage stochastic programming: A scenario tree based approach to planning under uncertainty
Publication date :
2011
Main work title :
Decision Theory Models for Applications in Artificial Intelligence: Concepts and Solutions
Editor :
Sucar, L. Enrique
Morales, Eduardo F.
Hoey, Jesse
Publisher :
Information Science Publishing, Hershey, PA, United States
ISBN/EAN :
978-1609601652
Peer reviewed :
Peer reviewed
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique
BELSPO - Politique scientifique fédérale
Funding text :
This paper presents research results of the Belgian Network DYSCO (Dynamical Systems, Control, and Optimization), funded by the Interuniversity Attraction Poles Programme, initiated by the Belgian State, Science Policy Office. The scientific responsibility rests with its authors. Damien Ernst is a Research Associate of the Belgian FRS-FNRS, of which he acknowledges the financial support. This work was supported in part by the IST Programme of the European Community, under the PASCAL2 Network of Excellence, IST-2007-216886.