A data-driven practical stabilization approach for solving stochastic dynamic programming problems

Tipaldi, M.; Iervolino, R.; Massenio, P. R.; Forootani, A.

doi:10.1016/j.automatica.2025.112372

This paper presents a data-driven practical stabilization approach for solving stochastic Dynamic Programming problems with unknown Markov Decision Process models over an infinite time horizon. The Bellman operator is modeled as a discrete-time switched affine system in which each mode represents a specific stationary stochastic policy, and an external bounded disturbance term accounts for the uncertainty arising from the unknown model. A two-step approach is followed. First, a model-based robust practical stabilization problem is solved to derive stabilization conditions that enable the practical convergence of the resulting closed-loop system trajectories towards a chosen reference value function. Then, by exploiting recent model-to-data Linear Matrix Inequality transformation tools, these results are extended to obtain data-driven robust stabilization conditions for model-free problems. These data-driven stabilization conditions are deployed in the Value Iteration algorithm and tested on the recycling robot and parking lot management problems to demonstrate the effectiveness of the proposed method.
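
As a rough illustration of the switched affine view of the Bellman operator mentioned in the abstract, the following minimal sketch runs textbook, model-based Value Iteration on a small MDP and writes each step as V_next = A_sigma V + b_sigma, where the mode sigma is a policy. It is not the paper's method: the two-state MDP, its transition probabilities and rewards, the discount factor, and all function names are hypothetical, the modes are restricted to deterministic policies for brevity (the abstract refers to stationary stochastic policies), and the disturbance term and the data-driven Linear Matrix Inequality conditions are omitted.

```python
GAMMA = 0.9  # discount factor (assumed value)

# Hypothetical 2-state, 2-action MDP (not taken from the paper).
# P[a][s][t] = Pr(next state t | state s, action a); R[a][s] = expected reward.
P = [
    [[0.8, 0.2], [0.3, 0.7]],
    [[0.1, 0.9], [0.6, 0.4]],
]
R = [
    [1.0, 0.0],
    [0.5, 2.0],
]
N_STATES, N_ACTIONS = 2, 2


def mode(policy):
    """Return (A, b) of the affine map V -> A V + b induced by a deterministic policy."""
    A = [[GAMMA * P[policy[s]][s][t] for t in range(N_STATES)] for s in range(N_STATES)]
    b = [R[policy[s]][s] for s in range(N_STATES)]
    return A, b


def apply_mode(A, b, V):
    """Apply one affine mode to the value vector V."""
    return [sum(A[s][t] * V[t] for t in range(N_STATES)) + b[s] for s in range(N_STATES)]


def greedy_policy(V):
    """Switching rule: in each state, pick the action with the best one-step lookahead."""
    def q(s, a):
        return R[a][s] + GAMMA * sum(P[a][s][t] * V[t] for t in range(N_STATES))
    return tuple(max(range(N_ACTIONS), key=lambda a: q(s, a)) for s in range(N_STATES))


def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate V <- A_sigma V + b_sigma with sigma chosen greedily at every step."""
    V = [0.0] * N_STATES
    for _ in range(max_iter):
        sigma = greedy_policy(V)
        V_next = apply_mode(*mode(sigma), V)
        if max(abs(x - y) for x, y in zip(V_next, V)) < tol:
            return V_next, sigma
        V = V_next
    return V, greedy_policy(V)


V_star, sigma_star = value_iteration()
print("value function:", V_star)
print("active mode (policy):", sigma_star)
```

With the greedy switching rule, the active mode at each step is the policy that maximizes the one-step lookahead, so the iteration coincides with the usual Bellman optimality update. Because each mode's matrix is the discount factor times a row-stochastic matrix, the iteration contracts towards a fixed point in this model-based setting.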

A data-driven practical stabilization approach for solving stochastic dynamic programming problems / Tipaldi, M.; Iervolino, R.; Massenio, P. R.; Forootani, A. - In: AUTOMATICA. - ISSN 0005-1098. - 178:(2025). [10.1016/j.automatica.2025.112372]