Abstract
We introduce Stochastic Dynamic Cutting Plane (StoDCuP), an extension of the Stochastic Dual Dynamic Programming (SDDP) algorithm to solve multistage stochastic convex optimization problems. At each iteration, the algorithm builds lower bounding affine functions not only for the cost-to-go functions, as SDDP does, but also for some or all nonlinear cost and constraint functions. We show the almost sure convergence of StoDCuP. We also introduce an inexact variant of StoDCuP where all subproblems are solved approximately (with bounded errors) and show the almost sure convergence of this variant for vanishing errors. Finally, numerical experiments are presented on nondifferentiable multistage stochastic programs where Inexact StoDCuP computes a good approximate policy quicker than StoDCuP while SDDP and the previous inexact variant of SDDP combined with Mosek library to solve subproblems were not able to solve the differentiable reformulation of the problem.
Similar content being viewed by others
Data Availability Statement
All (simulated) data generated or analyzed during this study can be obtained following the steps given in Sect. 5 of this article.
Change history
16 April 2021
Under the section ‘4 Inexact Cuts in StoDCuP’, part b), the missing term ‘denote the’ has been reinserted so it reads ‘...denote the corresponding linearizations’.
Notes
The same notation \(\xi _{\mathtt{Index}}\) is used to denote the realization of the process at node Index of the scenario tree and the value of the process \((\xi _t)\) for stage Index. The context will allow us to know which concept is being referred to. In particular, letters n and m will only be used to refer to nodes while t will be used to refer to stages.
We checked that the instances generate nontrivial nondifferentiable problems in the sense that no function in the max dominates the other on the set \({\mathcal {X}}_t:=\{x_t \in {\mathbb {R}}^n: -100\,{\mathbf{e }} \le x_t \le 100\,{\mathbf{e }}\}\).
We also implemented ISDDP using the inexact cuts from Section 2 of [11] and such variant could not solve our instances neither, again because Mosek failed to solve all quadratic subproblems of the corresponding ISDDP.
The tests were run in file TestStoDCuP.m and the functions implementing StoDCup and IStoDCuP are inexact_stodcup_quadratic.m and inexact_stodcup_quadratic_cut_ selection.m, this latter being a variant with cut selection, denoted IStoDCuP CS in this section.
References
Andersen, E.D., Andersen, K.D.: The MOSEK optimization toolbox for MATLAB manual. Version 9.2, 2019. https://www.mosek.com/documentation/
Benders, J.F.: Partitioning procedures for solving mixed-variables programming problems. Numer. Math. 4, 238–252 (1962)
Birge, J.R.: Decomposition and partitioning methods for multistage stochastic linear programs. Oper. Res. 33, 989–1007 (1985)
Ding, L., Ahmed, S., Shapiro, A.: A python package for multi-stage stochastic programming. Optimization Online (2019)
Girardeau, P., Leclere, V., Philpott, A.B.: On the convergence of decomposition methods for multistage stochastic convex programs. Math. Oper. Res. 40, 130–145 (2015)
Guigues, V.: Convergence analysis of sampling-based decomposition methods for risk-averse multistage stochastic convex programs. SIAM J. Optim. 26, 2468–2494 (2016)
Guigues, V.: Dual dynamic programing with cut selection: convergence proof and numerical experiments. Eur. J. Oper. Res. 258, 47–57 (2017)
Guigues, V.: Inexact cuts in stochastic dual dynamic programming. SIAM J. Optim. 30, 407–438 (2020)
Guigues, V.: Inexact stochastic mirror descent for two-stage nonlinear stochastic programs. Accepted for publication in Mathematical Programming (2020). https://arxiv.org/pdf/1805.11732.pdf
Guigues, V., Bandarra, M.: Single cut and multicut SDDP with cut selection for multistage stochastic linear programs: convergence proof and numerical experiments. Computational Management Science. https://arxiv.org/abs/1902.06757
Guigues, V., Monteiro, R., Svaiter, B.: Inexact cuts in SDDP applied to multistage stochastic nondifferentiable problems. ar**v (2020). https://arxiv.org/abs/2004.02701
Guigues, V., Römisch, W.: Sampling-based decomposition methods for multistage stochastic programs based on extended polyhedral risk measures. SIAM J. Optim. 22, 286–312 (2012)
Guigues, V., Römisch, W.: SDDP for multistage stochastic linear programs based on spectral risk measures. Oper. Res. Lett. 40, 313–318 (2012)
Guigues, V., Shapiro, A., Cheng, Y.: Duality and sensitivity analysis of multistage linear stochastic programs. Optimization OnLine (2019)
Guigues, V., Tekaya, W., Lejeune, M.: Regularized decomposition methods for deterministic and stochastic convex optimization and application to portfolio selection with direct transaction and market impact costs. Optim. Eng. 21, 1133–1165 (2020)
Kelley, J.E.: The cutting plane method for solving convex programs. J. SIAM 8, 703–712 (1960)
Kiwiel, K.C.: An aggregate subgradient method for nonsmooth convex minimization. Math. Program. 27, 320–341 (1983)
Kiwiel, K.C.: Proximity control in bundle methods for convex nondifferentiable minimization. Math. Program. 46, 105–122 (1990)
Kozmík, V., Morton, D.P.: Evaluating policies in risk-averse multi-stage stochastic programming. Math. Program. 152, 275–300 (2015)
Lemaréchal, C.: An extension of Davidon methods to non-differentiable problems. Math. Program. Study 3, 95–109 (1975)
Lemaréchal, C., Nemirovski, A., Nesterov, Y.: New variants of bundle methods. Math. Program. 69, 111–147 (1995)
Liu, R.P., Shapiro, A.: Risk neutral reformulation approach to risk averse stochastc programming. ar**v (2018). ar**v:1901.01302
Pereira, M.V.F., Pinto, L.M.V.G.: Multi-stage stochastic optimization applied to energy planning. Math. Program. 52, 359–375 (1991)
Philpott, A., de Matos, V.: Dynamic sampling algorithms for multi-stage stochastic programs with risk aversion. Eur. J. Oper. Res. 218, 470–483 (2012)
Philpott, A., de Matos, V., Finardi, E.: Improving the performance of stochastic dual dynamic programming. J. Comput. Appl. Math. 290, 196–208 (2012)
Philpott, A.B., Guan, Z.: On the convergence of stochastic dual dynamic programming and related methods. Oper. Res. Lett. 36, 450–455 (2008)
Powell, W.P.: Approximate Dynamic Programming, 2nd edn. Wiley (2011)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Rockafellar, T.: Conjugate Duality and Optimization. No 16 in Conference Board of Math. Sciences Series, pp. 1–79. SIAM Publications (1974)
Ruszczyński, A., Shapiro, A.: Conditional risk map**s. Math. Oper. Res. 31, 544–561 (2006)
Ruszczyński, A., Shapiro, A.: Optimization of convex risk functions. Math. Oper. Res. 31, 433–452 (2006)
Shapiro, A.: Analysis of stochastic dual dynamic programming method. Eur. J. Oper. Res. 209, 63–72 (2011)
Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on Stochastic Programming: Modeling and Theory. SIAM, Philadelphia (2009)
Shapiro, A., Ding, L.: Periodical multistage stochastic programs. SIAM J. Optim. 30, 2083–2102 (2020)
Van Slyke, R.M., Wets, R.J.-B.: L-shaped linear programs with applications to optimal control and stochastic programming. SIAM J. Appl. Math. 17, 638–663 (1969)
Wolkowicz, H., Saigal, R., Vandenberghe, L.: Handbook of Semidefinite Programming. Springer, Berlin (2000)
Acknowledgements
The research of the first author was partially supported by an FGV grant, CNPq grants 401371/2014-0, 311289/2016-9, 204872/2018-9, and FAPERJ grant E-26/201.599/2014. Research of the second author was partially supported by CNPq grant 401371/2014-0.
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by Lars Grüne.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
To prove (3.40) and (3.53), we will need the following lemma (the proof of (ii) of this lemma was given in [5] for a more general sampling scheme and the proof of (i), that we detail, is similar to the proof of (ii)):
Lemma A.1
Assume that Assumptions (H0), (H1)-Sto, and (H2) hold for StoDCuP. Define random variables \(y_n^k = 1(k \in {\mathcal {S}}_n)\).
-
(i)
Let \(\varepsilon >0\), \(t \in \{1,\ldots ,T\}\), \(n \in {\mathtt{Nodes}(t-1)}\), \(m \in C(n)\), \(i \in \{1,\ldots ,p\}\) and set
$$\begin{aligned} K_{\varepsilon , m , i}=\left\{ k \ge 1 : g_{t i}( x_m^{k}, x_{n}^{k} , \xi _m ) - g_{t i j_t(m)}^{k-1}( x_m^{k}, x_{n}^{k} ) \ge \varepsilon \right\} . \end{aligned}$$Let
$$\begin{aligned} \Omega _0( \varepsilon )=\{ \omega \in \Omega : |K_{\varepsilon , m , i}(\omega )| \text{ is } \text{ infinite } \} \end{aligned}$$and assume that \(\Omega _0( \varepsilon ) \ne \emptyset \). Define on the sample space \(\Omega _0 ( \varepsilon )\) the random variables \({\mathcal {I}}_{\varepsilon , m , i}(j), j \ge 1\), where \({\mathcal {I}}_{\varepsilon , m , i}(1)=\min \{k \ge 1: k \in K_{\varepsilon , m , i}(\omega ) \}\) and for \(j \ge 2\)
$$\begin{aligned} {\mathcal {I}}_{\varepsilon , m , i}(j)=\min \{ k > {\mathcal {I}}_{\varepsilon , m , i}(j-1) : k \in K_{\varepsilon , m , i}(\omega )\}, \end{aligned}$$i.e., \({\mathcal {I}}_{\varepsilon , m , i}(j)(\omega )\) is the index of jth iteration k such that \(g_{t i}( x_m^{k}, x_{n}^{k} , \xi _m ) - g_{t i j_t (m)}^{k-1}( x_m^{k}, x_{n}^{k} ) \ge \varepsilon \). Then random variables \((y_n^{{\mathcal {I}}_{\varepsilon , m , i}(j)})_{j \ge 1}\) defined on sample space \(\Omega _0(\varepsilon )\) are independent, have the distribution of \(y_n^1\) and therefore by the Strong Law of Large numbers we have
$$\begin{aligned} {\mathbb {P}}\left( \lim _{N \rightarrow +\infty } \frac{1}{N} \sum _{j=1}^N y_n^{{\mathcal {I}}_{\varepsilon , m , i}(j)}= {\mathbb {E}}[y_n^1] \right) =1. \end{aligned}$$(A.1) -
(ii)
Let \(\varepsilon >0\), \(t \in \{1,\ldots ,T\}\), \(n \in {\mathtt{Nodes}(t-1)}\), and set
$$\begin{aligned} K_{\varepsilon , n}=\left\{ k \ge 1 : {\mathcal {Q}}_t ( x_{n}^{k} ) - {\mathcal {Q}}_t^k ( x_{n}^{k} ) \ge \varepsilon \right\} . \end{aligned}$$Let
$$\begin{aligned} \Omega _1( \varepsilon )=\{\omega \in \Omega : |K_{\varepsilon , n}(\omega )| \text{ is } \text{ infinite } \} \end{aligned}$$and assume that \(\Omega _1( \varepsilon ) \ne \emptyset \). Define on the sample space \(\Omega _1( \varepsilon )\) the random variables \({\mathcal {I}}_{\varepsilon , n}(j), j \ge 1\), where \({\mathcal {I}}_{\varepsilon , n}(1)=\min \{k \ge 1: k \in K_{\varepsilon , n}(\omega ) \}\) and for \(j \ge 2\)
$$\begin{aligned} {\mathcal {I}}_{\varepsilon , n}(j)=\min \{ k > {\mathcal {I}}_{\varepsilon , n}(j-1) : k \in K_{\varepsilon , n}(\omega )\}, \end{aligned}$$i.e., \({\mathcal {I}}_{\varepsilon , n}(j)(\omega )\) is the index of jth iteration k such that \({\mathcal {Q}}_t ( x_{n}^{k} ) - {\mathcal {Q}}_t^k ( x_{n}^{k} ) \ge \varepsilon \). Then random variables \((y_n^{{\mathcal {I}}_{\varepsilon , n}(j)})_{j \ge 1}\) defined on sample space \(\Omega _1( \varepsilon )\) are independent, have the distribution of \(y_n^1\) and therefore by the Strong Law of Large numbers we have
$$\begin{aligned} {\mathbb {P}}\left( \lim _{N \rightarrow +\infty } \frac{1}{N} \sum _{j=1}^N y_n^{{\mathcal {I}}_{\varepsilon , n}(j)}= {\mathbb {E}}[y_n^1] \right) =1. \end{aligned}$$(A.2)
Proof
(i) Define on the sample space \(\Omega _0( \varepsilon )\) the random variables \((w_{\varepsilon , m , i}^k)_k\) by
To alleviate notation (\(\varepsilon , m , n, i\) being fixed), let us put \(w^k := w_{\varepsilon , m , i}^k\), \({\mathcal {I}}(j):={\mathcal {I}}_{\varepsilon , m , i}(j)\), For \({{\overline{y}}}_j \in \{0,1\}\), we have
Observe that the event \({\mathcal {I}}(j) = {\overline{{\mathcal {I}}}}_j\) can be written as the union \(\bigcup _{1 \le {\overline{{\mathcal {I}}}}_1< {\overline{{\mathcal {I}}}}_2< \ldots < {\overline{{\mathcal {I}}}}_j} E({\overline{{\mathcal {I}}}}_1,\ldots ,{\overline{{\mathcal {I}}}}_j)\) of events
Due to Assumption (H2) observe that random variable \(y_n^{{\overline{{\mathcal {I}}}}_j}\) is independent of random variables \(w^{i}, i=1,\ldots ,\overline{{\mathcal {I}}}_j\), and therefore events \(\{y_n^{{\overline{{\mathcal {I}}}}_j} = {{\overline{y}}}_j \}\) and \(\{ {\mathcal {I}}(j) = {\overline{{\mathcal {I}}}}_j \}\) are independent which gives
where we have used the fact that \(y_n^{1}\) and \(y_n^{{\overline{{\mathcal {I}}}}_j}\) have the same distribution (from (H2)).
Next for \({{\overline{y}}}_1,\ldots ,{{\overline{y}}}_p \in \{0,1\}\), we have
By the same reasoning as above, the event
can be expressed in terms of random variables \(y_n^{{\overline{{\mathcal {I}}}}_1},\ldots ,y_n^{{\overline{{\mathcal {I}}}}_{p-1}},w_n^{{\overline{{\mathcal {I}}}}_{1}},\ldots ,w_n^{{\overline{{\mathcal {I}}}}_{p}}\), and is therefore independent of event \(\{ y_n^{{\overline{{\mathcal {I}}}}_p} = {{\overline{y}}}_p \}\). It follows that
By induction this implies
which shows that random variables \((y_n^{{\mathcal {I}}(j)})_{j \ge 1}\) are independent.
The proof of (ii) is similar to the proof of (i). \(\square \)
Proof of (3.40) and (3.53). As in [5], we can now use the previous lemma to prove (3.40) and (3.53). Let us prove (3.40). By contradiction, assume that (3.40) does not hold. Then, there is \(\varepsilon >0\) such that the set \(\Omega _0( \varepsilon )\) defined in Lemma A.1 is nonempty. By Lemma A.1, this implies that (A.1) holds. But due to (3.39), only a finite number of indices \({{\mathcal {I}}_{\varepsilon , m , i}(j)}\) can be in \({\mathcal {S}}_n\) (with corresponding variable \(y_n^{{{\mathcal {I}}_{\varepsilon , m , i}(j)}}\) being one) and therefore \({\mathbb {P}}\left( \lim _{N \rightarrow +\infty } \frac{1}{N} \sum _{j=1}^N y_n^{{\mathcal {I}}_{\varepsilon , m , i}(j)}= 0 \right) =1\), which is a contradiction with (A.1).
The proof of (3.53) is similar to the proof of (3.40), by contradiction and using (3.52) and Lemma A.1-(ii) (see also [5, 6]). \(\square \)
Rights and permissions
About this article
Cite this article
Guigues, V., Monteiro, R.D.C. Stochastic Dynamic Cutting Plane for Multistage Stochastic Convex Programs. J Optim Theory Appl 189, 513–559 (2021). https://doi.org/10.1007/s10957-021-01842-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10957-021-01842-x