Applying unweighted least-squares based techniques to stochastic dynamic programming: Theory and application

Forootani, A.; Iervolino, R.; Tipaldi, M.

doi:10.1049/iet-cta.2019.0289

Big data and the curse of dimensionality are terms that researchers in many communities now routinely encounter, for example in dynamic programming (DP) within the automatic control community. This study proposes a novel unweighted, sample-based least-squares (LS) projection approach to address the large state space of the DP optimisation problem. In particular, the method takes into account both the contraction mapping and the monotonicity properties of the DP algorithm for value function approximation. Specifically, a batch of samples is first gathered according to a uniform probability distribution, and an unweighted LS sub-problem is then solved in the subspace. As a case study, a new Markov decision process model associated with a resource allocation problem is considered to illustrate the technique and evaluate its effectiveness. The approach can also be employed in other applications. Moreover, MATLAB-based software is developed to implement and examine the different parts of the proposed method, and simulation examples run with this software support the results. The idea also connects recent advances in big data analysis with approximate DP.
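The sampling and projection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB software, and it does not reproduce their resource allocation model or their convergence analysis. Everything named here is an assumption made for illustration: the randomly generated toy MDP, the discount factor `alpha`, the feature matrix `Phi`, and the function names. The sketch repeatedly applies the Bellman operator and fits the result, on a given set of sampled states, by ordinary (unweighted) least squares.

```python
import numpy as np


def make_toy_mdp(n_states, n_actions, rng):
    """Hypothetical random MDP: P[a] is a row-stochastic matrix, g[a] a cost vector."""
    P = rng.random((n_actions, n_states, n_states))
    P /= P.sum(axis=2, keepdims=True)  # normalise each row to sum to 1
    g = rng.random((n_actions, n_states))
    return P, g


def bellman_backup(P, g, alpha, J):
    """Apply the DP operator: (TJ)(i) = min_a [ g(i, a) + alpha * sum_j P_a(i, j) J(j) ]."""
    Q = g + alpha * (P @ J)  # shape (n_actions, n_states)
    return Q.min(axis=0)


def unweighted_ls_value_iteration(P, g, Phi, alpha, sample_idx, n_iters=200):
    """Approximate J by Phi @ r, refitting r on the sampled states at every iteration.

    Each iteration solves the unweighted LS sub-problem
        min_r  sum_{i in sample_idx} (Phi[i] @ r - (T(Phi r_k))(i))^2.
    """
    r = np.zeros(Phi.shape[1])
    Phi_s = Phi[sample_idx]  # feature rows of the sampled states only
    for _ in range(n_iters):
        target = bellman_backup(P, g, alpha, Phi @ r)[sample_idx]
        r, *_ = np.linalg.lstsq(Phi_s, target, rcond=None)
    return r


# Illustrative run: 50 states, 3 actions, quadratic features, 30 uniform samples.
rng = np.random.default_rng(0)
n_states = 50
P, g = make_toy_mdp(n_states, 3, rng)
s = np.linspace(0.0, 1.0, n_states)
Phi = np.column_stack([np.ones(n_states), s, s**2])
sample_idx = rng.integers(0, n_states, size=30)  # uniform sampling of states
r = unweighted_ls_value_iteration(P, g, Phi, 0.9, sample_idx, n_iters=100)
J_approx = Phi @ r  # approximate value function on all states
```

With identity features and every state sampled, the iteration reduces to exact value iteration, which gives a simple sanity check. For a general feature matrix, this sketch alone does not guarantee convergence.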

Applying unweighted least-squares based techniques to stochastic dynamic programming: Theory and application / Forootani, A.; Iervolino, R.; Tipaldi, M. - In: IET CONTROL THEORY & APPLICATIONS. - ISSN 1751-8644. - 13:15(2019), pp. 2387-2398. [10.1049/iet-cta.2019.0289]