Optimality in natural physics: Revisiting classical dynamics from optimal control

A Dynamic Programming approach

Hsieh, Sheng-Han
Mar 12, 2022

Preface

In our previous post, we showed that Lagrangian and Hamiltonian mechanics govern the optimal trajectories of the extremal-action problem [1]. We also pointed out an observation there that will be the topic of this post:

A classical dynamical system is a solution of an optimal control problem

Optimal control problem (OCP)

A typical optimal control problem is composed of an influenceable dynamical system and a corresponding objective function. The task is to find a control signal (trajectory) u that minimizes the objective. If you are familiar with control systems, this might seem to be an open-loop approach (in which u is not an explicit function of the current state), but as we march through the analysis, an intriguing hidden structure will be revealed.

A typical setup of an OCP without constraints; C is the objective function
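
Concretely, such a setup can be sketched as follows (the notation is illustrative and may differ from the figure; Φ is an optional terminal cost):

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t), u(t)\bigr), \qquad x(t_0) = x_0,\\
C[u] &= \Phi\bigl(x(t_f)\bigr) + \int_{t_0}^{t_f} L\bigl(x(t), u(t)\bigr)\,dt,\\
u^* &= \arg\min_{u(\cdot)} C[u].
\end{aligned}
```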

Hamilton–Jacobi–Bellman equation (HJB)

Dynamic programming is a powerful methodology that breaks the original problem into smaller, solvable subproblems. It is a direct interpretation of the Principle of Optimality [2]:

An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.

By applying dynamic programming to the OCP described above, we arrive at the Hamilton–Jacobi–Bellman equation.

Simplified derivation of HJB using dynamic programming
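
A sketch of that derivation: split the cost-to-go at a small time step δt, apply the Principle of Optimality to the tail, and Taylor-expand J (notation as in the sketch above):

```latex
\begin{aligned}
J(x, t) &= \min_{u(\cdot)} \left[ \int_{t}^{t_f} L(x, u)\, d\tau + \Phi\bigl(x(t_f)\bigr) \right]\\
&= \min_{u} \Bigl[ L(x, u)\,\delta t + J\bigl(x + f(x, u)\,\delta t,\; t + \delta t\bigr) \Bigr] + O(\delta t^2)\\
\Longrightarrow\quad 0 &= \frac{\partial J}{\partial t} + \min_{u} \Bigl[ L(x, u) + \nabla_x J \cdot f(x, u) \Bigr].
\end{aligned}
```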

Notice the OCP is now replaced with a smaller, lower-dimensional one: instead of searching over an entire control trajectory, we only minimize over a single control vector u at each state, which is much easier. Assume the problem is indeed solvable, with a minimizer u* expressed as a function of J, the optimal cost-to-go. The HJB then becomes a partial differential equation that shapes the field J. In practice, J can be solved numerically by starting from the termination boundary and propagating backward over the whole region of interest [3].
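
To illustrate that backward sweep, here is a minimal sketch in Python for the hypothetical scalar system x_dot = u with running cost L = x² + u² and zero terminal cost (the system and costs are chosen for illustration only, not taken from the figures):

```python
import numpy as np

# Backward propagation of the optimal cost-to-go J on a state grid,
# for the toy problem x_dot = u, L = x^2 + u^2, J(x, t_f) = 0.
dt = 0.01                             # time step
T = 2.0                               # horizon
xs = np.linspace(-2.0, 2.0, 201)      # state grid
us = np.linspace(-3.0, 3.0, 61)       # candidate controls

J = np.zeros_like(xs)                 # terminal boundary condition
for _ in range(int(T / dt)):          # sweep backward from t_f to t_0
    x_next = xs[:, None] + us[None, :] * dt             # one Euler step
    stage = (xs[:, None] ** 2 + us[None, :] ** 2) * dt  # running cost
    Q = stage + np.interp(x_next, xs, J)                # cost of each (x, u)
    J = Q.min(axis=1)                                   # Bellman backup over u

# For this LQR problem, the Riccati solution gives J(1, 0) = tanh(T) ≈ 0.96.
print("J(x=1, t=0) ≈", np.interp(1.0, xs, J))
```

Each Bellman backup minimizes over a single control value, which is exactly the smaller subproblem described above.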

The HJB equation with the optimal control applied; the blue lines are contours of J, and the orange line is one of the “optimal” trajectories

Classical dynamics as an OCP

In order to set up an OCP that generates classical dynamics, we start with a fully actuated system without any passive dynamics, shown in the following equations. The generalized positions and velocities of a mechanical system are encoded in the state x. Readers may interpret such a setup as finding the equation of motion (x_dot) that is optimal with respect to the given objective function (the time integral of L).

Optimal control problem of a fully actuated system without any passive dynamics
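
In symbols, the setup is roughly the following, with x playing the role of the generalized coordinates and u of their rates (the figure's exact notation may differ):

```latex
\begin{aligned}
\dot{x}(t) &= u(t) \qquad \text{(fully actuated, no passive dynamics)},\\
u^* &= \arg\min_{u(\cdot)} \int_{t_0}^{t_f} L\bigl(x(t), u(t)\bigr)\, dt .
\end{aligned}
```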

HJB of classical dynamics

With the same dynamic programming trick applied to the OCP above, the resulting HJB and its first-order necessary condition can be written as the following equations. The negative state gradient of the optimal cost-to-go is denoted p, and a term called the Hamiltonian H is introduced over the state space with the optimal control u* applied everywhere. As it turns out, p, which acts analogously to a Lagrange multiplier, is exactly the momentum defined by the partial derivative of the Lagrangian L.

The second equation is the corresponding first-order necessary condition for the minimization in the first (HJB) equation
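
With p := −∂J/∂x, the two equations can be sketched as follows (illustrative notation; H is the Legendre transform of L):

```latex
\begin{aligned}
0 &= \frac{\partial J}{\partial t} + \min_{u}\bigl[ L(x, u) - p^{\top} u \bigr]
   = \frac{\partial J}{\partial t} - H(x, p),\\
0 &= \frac{\partial L}{\partial u}\bigg|_{u = u^*} - p
   \quad\Longrightarrow\quad p = \frac{\partial L}{\partial \dot{x}},\\
H(x, p) &:= p^{\top} u^*(x, p) - L\bigl(x, u^*(x, p)\bigr).
\end{aligned}
```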

The HJB equation above holds once the optimal control is applied. With the sign of the optimal cost-to-go flipped (giving the optimal cost-to-reach, or simply the “action”), it becomes the well-known Hamilton–Jacobi equation [4].
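
Writing S := −J for the action, the equation takes the familiar form:

```latex
\frac{\partial S}{\partial t} + H\!\left(x, \frac{\partial S}{\partial x}\right) = 0,
\qquad p = \frac{\partial S}{\partial x}.
```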

The Hamiltonian

Now we may further test out different properties of the Hamiltonian defined along with the fully actuated OCP. We start with the partial derivative with respect to a specific momentum component (we assume u* can be expressed in terms of p).

Deriving Hamiltonian mechanics from OCP
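
A sketch of that computation; the middle term vanishes by the first-order condition p = ∂L/∂u:

```latex
\frac{\partial H}{\partial p}
= u^* + \left(p - \frac{\partial L}{\partial u}\right)^{\!\top} \frac{\partial u^*}{\partial p}
= u^* = \dot{x}.
```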

Next we compute the total time derivative of the momentum (again, we have changed the free variables from u* to p).

Deriving Hamiltonian mechanics from OCP
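
A sketch of that computation: differentiate p = −∂J/∂x along the trajectory and use the HJB equation ∂J/∂t = H(x, p):

```latex
\begin{aligned}
\dot{p} &= -\nabla_x \frac{\partial J}{\partial t} - \bigl(\nabla_x^{2} J\bigr)\,\dot{x},\\
\nabla_x \frac{\partial J}{\partial t} &= \frac{\partial H}{\partial x} - \bigl(\nabla_x^{2} J\bigr)\,\frac{\partial H}{\partial p}
\quad\Longrightarrow\quad
\dot{p} = -\frac{\partial H}{\partial x},
\end{aligned}
```

where the Hessian terms cancel because ẋ = ∂H/∂p.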

At this point, with the objective function L and the Hamiltonian H properly related by the Legendre transformation, we can claim that:

Hamiltonian mechanics is a solution to an optimal control problem

Similarly, Lagrangian mechanics may also be derived, following these steps:

Deriving Lagrangian mechanics from OCP
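
A sketch: by the Legendre transform, ∂H/∂x = −∂L/∂x, so combining the momentum definition with ṗ = −∂H/∂x recovers the Euler–Lagrange equation:

```latex
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}}
= \dot{p}
= -\frac{\partial H}{\partial x}
= \frac{\partial L}{\partial x}
\quad\Longrightarrow\quad
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0 .
```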

Caveat

In the formulation of the OCP for classical dynamics, the generalized positions and velocities were mixed together in the state x, which makes partial differentiation with respect to those variables somewhat tricky.

References

[1] C. Lanczos, The Variational Principles of Mechanics. Dover Publications, 1986.

[2] R. Bellman, Dynamic Programming. Princeton University Press, 1957.

[3] A. E. Bryson and Y. C. Ho, Applied Optimal Control: Optimization, Estimation and Control. Taylor & Francis Group, 2017.

[4] B. Houchmandzadeh, “The Hamilton–Jacobi equation: an intuitive approach,” American Journal of Physics, vol. 88, no. 5, pp. 353–359, Apr. 2020.
