Study notes: An example of dynamical system imitation by a neural network
Learning Lorenz dynamics from sampled trajectories
Problem formulation
In systems engineering, building a mathematical model of a dynamical system is essential to a large class of methodologies and beneficial in many other ways. In this example, we seek a neural network mapping that best fits the Lorenz equations shown below, demonstrating the use of neural networks for dynamical-system prediction. An alternative approach at the other end of the spectrum is finding a sparse representation over a nonlinear function library; check here.
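The Lorenz system, with the parameter values commonly used to produce the chaotic attractor (assumed here, since the note does not restate them), is:

```latex
\begin{aligned}
\dot{x} &= \sigma\,(y - x) \\
\dot{y} &= x\,(\rho - z) - y \\
\dot{z} &= x\,y - \beta\,z
\end{aligned}
\qquad \sigma = 10,\quad \rho = 28,\quad \beta = 8/3
```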
Collecting data
The dataset used to train the network was generated by solving (simulating) the Lorenz equations from several random initial conditions over a fixed time horizon. Since the dynamic equations are time-invariant, it is reasonable to treat a single trajectory as multiple discrete-time transitions. The simulated trajectories are plotted in 3D in the following figure.
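A minimal data-generation sketch along these lines (the sampling period, horizon, trajectory count, and initial-condition range are assumptions, not the note's exact values):

```matlab
% Simulate the Lorenz system from random initial conditions and
% collect discrete-time transition pairs (x_k, x_{k+1}).
sigma = 10; rho = 28; beta = 8/3;               % classic parameters
lorenz = @(t, s) [sigma*(s(2) - s(1));
                  s(1)*(rho - s(3)) - s(2);
                  s(1)*s(2) - beta*s(3)];

dt = 0.01;                                      % sampling period (assumed)
tspan = 0:dt:8;                                 % horizon per trajectory (assumed)
nTraj = 100;                                    % number of trajectories (assumed)
X = []; Y = [];                                 % inputs x_k and targets x_{k+1}

for k = 1:nTraj
    x0 = 30*(rand(3,1) - 0.5);                  % random initial condition (assumed range)
    [~, s] = ode45(lorenz, tspan, x0);
    X = [X, s(1:end-1, :)'];                    % state at time k
    Y = [Y, s(2:end, :)'];                      % state at time k+1
end
```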
As one may notice, the dataset is not well distributed in state space but tends to traverse a specific region. This suggests poor extrapolation for a network trained on this data. For comparison, a broader dataset can be collected by spreading the initial conditions more widely, which leads to the following figure. We will compare the trained results in the last section of this note.
Start training!
Having built the dataset, we can train our four-layer NN using the MATLAB Deep Learning Toolbox. The mean squared error (MSE) between the supervisor (dataset) and the network prediction converged well after 1000 epochs.
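A training sketch using the toolbox's shallow-network API (the hidden-layer sizes are assumptions; the note does not specify them):

```matlab
% Fit a feedforward network approximating x_{k+1} = f(x_k).
net = feedforwardnet([10 10 10]);   % three hidden layers + output layer = four layers (sizes assumed)
net.trainParam.epochs = 1000;       % matches the 1000 epochs mentioned above
net.performFcn = 'mse';             % mean squared error objective
[net, tr] = train(net, X, Y);       % X, Y from the data-collection step
```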
We can also compare the trajectories produced by the Lorenz equations and by the NN model in state space for randomly chosen initial conditions. The left part of the following figure shows that the NN model indeed captures the interesting part of the Lorenz dynamics.
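One way to generate the NN trajectory is to iterate the learned one-step map from the same initial condition (a sketch; `net`, `lorenz`, and `dt` are defined above, and the test initial condition is arbitrary):

```matlab
% Roll out the learned one-step map and the true dynamics side by side.
x0 = [-8; 7; 27];                   % an arbitrary test initial condition
N  = 1000;                          % number of prediction steps
xNN = zeros(3, N); xNN(:, 1) = x0;
for k = 1:N-1
    xNN(:, k+1) = net(xNN(:, k));   % NN one-step prediction, fed back in
end
[~, sTrue] = ode45(lorenz, 0:dt:(N-1)*dt, x0);   % reference trajectory
plot3(sTrue(:,1), sTrue(:,2), sTrue(:,3)); hold on
plot3(xNN(1,:), xNN(2,:), xNN(3,:)); legend('Lorenz', 'NN model')
```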
A simple improvement
As stated previously, the dataset used to train this model covers only a small portion of the state space. The consequences are obvious if we set the initial condition far from where the dataset has visited. A naive approach to overcome this issue is to introduce broader training data, at the cost of a possibly worse MSE for the same neuron count. The resulting model (NN model 2) turns out to have roughly 10 times the MSE of the previous one, but it should potentially handle a wider range of states.
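In the data-collection sketch above, this amounts to widening the initial-condition spread, e.g. (the factor is an assumption):

```matlab
x0 = 100*(rand(3,1) - 0.5);   % wider initial-condition range for NN model 2 (assumed)
```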
Simply increasing the range of the dataset does not fundamentally fix the extrapolation issue. Nevertheless, simulating from the same initial conditions used above, we can still observe a significant improvement for initial states far from the original training set.
Conclusion and sparse model identification
In this example, we successfully trained a NN to imitate the dynamics of the Lorenz model. Despite an acceptable result, the approach yields a dramatically more complicated model than our target and usually fails for data points far from the training set. This significant disadvantage motivates identification methods that seek a parsimonious model, which will be our next topic here.