Reinforcement learning has shown notable empirical success in approximating solutions to the Hamilton-Jacobi-Bellman (HJB) equation, yielding highly dynamic controllers. However, finite sampling and the use of function approximators make it difficult to bound the suboptimality of the resulting controllers or the quality of the approximation of the true cost-to-go function, which has limited the broader application of such methods. Consequently, research efforts have intensified toward developing methods that offer such guarantees. Approaches explored include lower bounding the value function and relaxing the HJB equation, for both discrete-time and continuous-time systems. In recent…
