Model-based dual heuristic dynamic programming (MB-DHP) is a popular approach to approximating optimal solutions in control problems. However, it usually requires offline training of the model network, which incurs extra computational cost. In this brief, we propose a model-free DHP (MF-DHP) design based on a finite-difference technique. In particular, we adopt a multilayer perceptron with one hidden layer for both the action network and the critic network, and use delayed objective functions to train both networks online over time. We test both the MF-DHP and MB-DHP approaches on a discrete-time example and a continuous-time example under the same parameter settings. Our simulation results demonstrate that the MF-DHP approach can achieve control performance competitive with that of the traditional MB-DHP approach while requiring fewer computational resources.
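
To make the finite-difference idea concrete, the following is a minimal sketch of how state derivatives, which a model network would otherwise supply, might be estimated numerically. It is not the scheme proposed in this brief: it uses a generic central difference obtained by perturbing a black-box transition function, whereas an online design would have to work from delayed measurements instead. The names `finite_difference_jacobian` and `toy_plant`, the step size `eps`, and the linear plant coefficients are all assumptions chosen for illustration.

```python
from typing import Callable, List

Vector = List[float]


def finite_difference_jacobian(
    step: Callable[[Vector], Vector], x: Vector, eps: float = 1e-6
) -> List[List[float]]:
    """Central-difference estimate of d step(x) / d x (rows: outputs, cols: inputs)."""
    n_in = len(x)
    n_out = len(step(x))
    jac = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        # Perturb only the j-th state component in both directions.
        x_plus = list(x)
        x_plus[j] += eps
        x_minus = list(x)
        x_minus[j] -= eps
        f_plus = step(x_plus)
        f_minus = step(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * eps)
    return jac


def toy_plant(x: Vector) -> Vector:
    # Hypothetical linear plant x(t+1) = A x(t) with A = [[0.9, 0.1], [0.0, 0.8]].
    return [0.9 * x[0] + 0.1 * x[1], 0.8 * x[1]]


# The estimated Jacobian should be close to the matrix A used in toy_plant.
jacobian = finite_difference_jacobian(toy_plant, [1.0, -2.0])
```

For this linear toy plant the estimate recovers the matrix A up to floating-point error. On a nonlinear plant, the choice of `eps` trades truncation error against round-off error.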