This paper presents a data-driven practical stabilization approach for solving stochastic Dynamic Programming problems with unknown Markov Decision Process models over an infinite time horizon. The Bellman operator is modeled as a discrete-time switched affine system in which each mode represents a specific stationary stochastic policy, and an external bounded disturbance term accounts for the model uncertainty. A two-step approach is followed. First, a model-based robust practical stabilization problem is solved to derive conditions under which the closed-loop system trajectories converge practically towards a chosen reference value function. Then, using recent model-to-data Linear Matrix Inequality transformation tools, these results are extended to data-driven robust stabilization conditions that address the model-free case. The data-driven conditions are embedded in the Value Iteration algorithm and tested on the recycling robot and parking lot management problems to demonstrate the effectiveness of the proposed method.
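To make the switched affine view concrete, the sketch below builds the affine map V_next = A V + b induced by one stationary stochastic policy and runs plain model-based Value Iteration on top of it. It is only an illustration under stated assumptions: the two-state, two-action MDP, the discount factor, and the helper names `mode`, `bellman_optimality`, and `value_iteration` are hypothetical and are not taken from the paper, and the paper's disturbance term and LMI-based data-driven conditions are not reproduced here.

```python
import numpy as np

GAMMA = 0.9  # discount factor (assumed for illustration)

# Hypothetical 2-state, 2-action MDP; all numbers are illustrative only.
# P[a, s, t] = probability of moving from state s to state t under action a.
P = np.array([
    [[0.8, 0.2], [0.3, 0.7]],   # action 0
    [[0.5, 0.5], [0.9, 0.1]],   # action 1
])
# R[s, a] = expected one-step reward for taking action a in state s.
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])


def mode(policy):
    """Return (A, b) with V_next = A @ V + b for a stationary stochastic policy.

    policy[s, a] is the probability of choosing action a in state s.
    """
    A = GAMMA * np.einsum("sa,ast->st", policy, P)  # discounted policy-averaged transitions
    b = np.sum(policy * R, axis=1)                  # policy-averaged rewards
    return A, b


def bellman_optimality(V):
    """Apply the Bellman optimality operator; also return the greedy action per state."""
    Q = R + GAMMA * np.einsum("ast,t->sa", P, V)
    return Q.max(axis=1), Q.argmax(axis=1)


def value_iteration(tol=1e-10, max_iter=10_000):
    """Model-based Value Iteration: each step applies the mode of the greedy policy."""
    V = np.zeros(R.shape[0])
    for _ in range(max_iter):
        V_next, greedy = bellman_optimality(V)
        if np.max(np.abs(V_next - V)) < tol:
            return V_next, greedy
        V = V_next
    return V, greedy
```

In this reading, each Value Iteration step selects one affine mode (the greedy deterministic policy), so the iteration behaves like a switched affine system whose switching signal is the greedy policy. At convergence, the selected mode has the returned value function as its approximate fixed point.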

A data-driven practical stabilization approach for solving stochastic dynamic programming problems / Tipaldi, M.; Iervolino, R.; Massenio, P. R.; Forootani, A. - In: AUTOMATICA. - ISSN 0005-1098. - 178:(2025). [10.1016/j.automatica.2025.112372]

A data-driven practical stabilization approach for solving stochastic dynamic programming problems

Iervolino R. (Member of the Collaboration Group); 2025

Files in this record:
File: 1-s2.0-S0005109825002651-main.pdf
Access: open access
License: Public domain
Size: 1.08 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11588/1019099