Today, countless industrial and technological applications benefit from the latest advances in the field of Artificial Intelligence (AI): from mobile applications that put you in the shoes of Robert Downey Jr. in an Iron Man movie scene, to planes that take off and land autonomously using image-recognition algorithms to detect the position of the runway.
However, in most practical applications, the neural networks involved have been trained off-line beforehand, and by the deployment and industrialization stage, their parameters, weights, and architecture are frozen.
In other words, these neural networks are static, and their ability to learn during their in-service life has been restricted.
Off-line training has many advantages with respect to on-line training:
- More computing power is available to train large models during the development phase. Usually, the hardware running the final version of the embedded neural network is less capable than the hardware used in the training process.
- Overfitting and other well-known issues that appear during the neural network’s training can be detected early in the development phase.
- The predictions of a neural network trained off-line are deterministic, so its validation and verification cycle is much more straightforward than that of an on-line trained neural network, for which it is hard to guarantee that the predictions will not drift toward wrong values over time.
- No one wants a neural network that, due to poor learning during its in-service life, begins to provide incorrect predictions with a higher failure rate than stipulated. Depending on the application, as in safety-critical systems, these performance deviations might lead to catastrophic outcomes.
Nevertheless, to be able to train a neural network off-line, there is an essential requirement: we need a sufficiently large database in which the relationships between the neural network's inputs and outputs are unequivocally defined.
However, what happens if, in our particular application, we don't know a priori what the relationship between the neural network's inputs and outputs is, and this relationship can only be obtained in real time by running a test?
This is the case for the neural networks of adaptive controllers, which use the experience acquired in real time to readjust the internal parameters of the controller's neural network, rendering the response of the controlled vehicle as close as possible to that of a reference system considered optimal from the handling-qualities point of view.
Although you might be familiar with Artificial Neural Networks (ANN), unless you’re a physicist or an engineer, there really isn’t much reason for you to know about adaptive control. I know.
But adaptive controllers are kind of magical. They are a category of controllers that are really good at adapting over time with incredible resilience, and thus very handy for controlling complex systems with unknown dynamics.
Curious? Let’s find out more about the adaptive control world.
Adaptive Control — Mimicking Your Brain’s Learning Abilities
I want you to visualize this scene. Imagine you go back to your childhood: you are 4 years old now, and your father has just bought you a new bicycle. It is quite a prize.
But there is a problem, this incredible bike does not have training wheels, and you only know how to ride a bike with the safety that those training wheels give you.
Your father has told you that you have to get on that bike and ride it without falling to the ground, on the first try. If not, you will not get the bike (really? Well, just imagine it).
A hard task to do on the first try, don’t you think?
Well, this type of task is exactly what Model Reference Adaptive Control (MRAC) techniques can solve with incredible ease and efficiency: they provide fast adaptation to drive a system's dynamics close to those of a well-performing reference model.
In fact, our brain uses a learning method somewhat similar to that of the MRAC technique each time you train to learn a new physical activity.
In the previous case where you were meant to learn how to ride a bike, the reference model would be defined by the way you would like to move with that new bike without falling.
The problem is that our brain learns slowly, and the first time we try, we have a good chance of failing. Adaptive controllers like MRAC, on the other hand, learn in a very short time and are very good at these tasks.
A Proper Learning Strategy is the Name of the Game
Most adaptive control techniques use single-layer ANNs to estimate the unknown nonlinearities and malfunctions that could make the system depart from its nominal dynamics.
The output of this single-layer ANN is usually represented by θ = WΦ(x), where Φ(x) denotes the activation functions and W the weights.
In virtually every adaptive control application, the activation functions depend on the system's state vector x.
Model Reference Adaptive Controller architecture. Image by the author.
As you can guess from the picture above, the learning law applied to update the ANN's weight matrix W is the name of the game in adaptive control.
Depending on this law, the closed-loop system (the plant plus the controller) might become stable or unstable, so specific design and performance criteria need to be respected when defining the ANN's learning law.
Unlike off-line learning techniques applied to ANNs, such as stochastic gradient descent, dropout, normalization, and so on, on-line learning laws for adaptive systems require specific ad hoc methodologies that depend on the target system's dynamics.
The most common learning laws used in MRAC are based on the MIT rule plus a robustifying modification term, like the σ, ε, or optimal robust modifications.
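As a sketch, one forward-Euler step of an MIT-rule-style weight update with σ-modification can be written as follows. The scalar gains γ and σ and the use of a Lyapunov matrix P are assumptions for illustration; the general law uses a gain matrix Γ.

```python
import numpy as np

def sigma_mod_step(W, phi_x, e, P, B, gamma, sigma, dt):
    """One Euler step of a σ-modification learning law of the form

        Ẇ = -γ (eᵀ P B) Φ(x)ᵀ - σ γ W

    The first term drives adaptation from the instantaneous tracking
    error e; the σ-term pulls the weights toward zero to keep them
    bounded (the "robustifying" modification).
    """
    eTPB = (e @ P @ B).item()                     # scalar error term
    W_dot = -gamma * eTPB * phi_x[None, :] - sigma * gamma * W
    return W + dt * W_dot
```

Note the trade-off built into the σ-term: it guarantees bounded weights, but it also biases the weights away from their true values, which is part of why convergence is not guaranteed.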
A drawback of these learning laws is that, despite being derived from strict stabilization criteria, none of them guarantees the convergence of the ANN's weights W to their optimal values.
That is, the previous learning laws can guarantee the stability of the closed-loop system, but they do not guarantee that the ANN provides an accurate prediction of the system's unknown nonlinear dynamics.
To also achieve the ANN's convergence toward zero prediction error, on top of stabilization, control theory provides an additional necessary condition on the temporal characteristics of the system's state: the so-called Persistency of Excitation (PE).
In short, as the name suggests, a signal that is compliant with the PE condition contains some activity in any moving time window. Thus, to achieve the PE condition in the system's state, the reference input must also be PE.
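Formally, a standard statement of the PE condition from the adaptive control literature requires the regressor Φ(t) to satisfy, for some α, T > 0:

```latex
\int_{t}^{t+T} \Phi(\tau)\,\Phi^{\mathsf{T}}(\tau)\,d\tau \;\succeq\; \alpha I \qquad \forall\, t \ge t_0
```

That is, over every window of length T, the integrated outer product of the regressor stays uniformly positive definite, so the signal keeps "exploring" every direction of the feature space.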
But there is a big problem with this statement: the reference input's PE requirement has little or no practical applicability in real life.
For example, in aircraft control applications, reference inputs with PE characteristics may be a nuisance, waste fuel, and place undue stress on the aircraft through the continuous excitation of the structure's elastic modes. In real life, it makes no sense to continuously excite the aircraft's flight controls.
Furthermore, since the reference inputs for many on-line applications are event-based and not known a priori, it is often impossible to monitor on-line whether a signal is PE or not.
But back in late 2010, a research paper titled “Concurrent Learning for Convergence in Adaptive Control without Persistency of Excitation” changed it all.
The idea was simple but really effective.
The Concurrent Learning Concept
The main reason why the standard learning laws require the reference input signal to be PE is that all of them use only present data about the model approximation error (e).
Any information gathered from previous data observations is forgotten and not used at all. This is why the convergence of the ANN's weights is only guaranteed if the system is persistently excited.
But the Concurrent Learning concept took another approach: why not use recorded and instantaneous data concurrently for adaptation?
They proposed the following concurrent learning law:
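The formula appears as an image in the original post, but a Python sketch in the spirit of the Chowdhary–Johnson law conveys the idea: the weight update combines the usual instantaneous-error term with a sum of prediction errors over a bank of recorded data points. The scalar gains and the data structure for the history are assumptions for illustration.

```python
import numpy as np

def concurrent_learning_step(W, phi_x, e, P, B, history, gamma, gamma_c, dt):
    """One Euler step of a concurrent learning law of the form

        Ẇ = -γ (eᵀ P B) Φ(x)ᵀ - γ_c Σⱼ εⱼ Φ(xⱼ)ᵀ

    where each recorded pair (Φ(xⱼ), Δⱼ) stores a past regressor and
    the corresponding estimate of the model error at that point, and
    εⱼ = W Φ(xⱼ) - Δⱼ is the prediction error on stored data.
    """
    eTPB = (e @ P @ B).item()
    W_dot = -gamma * eTPB * phi_x[None, :]        # instantaneous term
    for phi_j, delta_j in history:                # recorded-data term
        eps_j = W @ phi_j - delta_j               # prediction error on point j
        W_dot -= gamma_c * np.outer(eps_j, phi_j)
    return W + dt * W_dot
```

The recorded-data term keeps pulling the weights toward values that explain the stored observations even when the current reference input is quiet, which is exactly why PE of the input is no longer needed.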
Simple and powerful.
A few years after the paper’s publication, the adaptive control community embraced this concept as the new standard of learning strategies for adaptive control. A game-changer formula.
But I know, a formula might not mean much to you without an illustrative example, so now I will show you how Concurrent Learning can enhance an adaptive controller's performance on a curious (and complex) control problem: how to control the wing-rock phenomenon in a fighter aircraft.
The Wing-Rock Phenomenon
Wing rock is an unsteady aerodynamic phenomenon attributed to a loss of stability in the aircraft's lateral-directional modes (most prominent in the rolling axis).
One of its main causes is the asymmetric twin vortices that can appear over the aircraft's forebody at high angles of attack, switching intermittently from a left-vortex-predominant pattern to a right-vortex-predominant pattern.
This “dancing” vortex condition is somewhat similar to the well-known Von Kármán vortex street aerodynamic phenomenon.
The main driver that triggers the aircraft's lateral-directional instability is the interaction of these forebody asymmetric vortices with the flowfield over the aircraft's wings. That's why this phenomenon is common among high-performance fighters and usually appears when the pilot is flying in the high subsonic speed regime while pulling high-g maneuvers.
Going back to our control problem example, the best way to simulate the wing-rock phenomenon is to model the aircraft’s aerodynamic rolling moment as a nonlinear function of the bank angle and the roll rate, using a semi-empirical formulation, so the simplified equations of motion for the aircraft’s rolling axis can be expressed as:
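The equations themselves appear as an image in the original post. For orientation, a semi-empirical wing-rock model commonly used in the adaptive control literature takes the following form, with x₁ the bank angle, x₂ the roll rate, δ the aileron command, and the Wᵢ* the unknown coefficients (the exact coefficients in the post's model may differ):

```latex
\begin{aligned}
\dot{x}_1 &= x_2 \\
\dot{x}_2 &= L_{\delta}\,\delta + \Delta(x) \\
\Delta(x) &= W_0^{*} + W_1^{*} x_1 + W_2^{*} x_2
           + W_3^{*} |x_1|\,x_2 + W_4^{*} |x_2|\,x_2 + W_5^{*} x_1^{3}
\end{aligned}
```

Note that Δ(x) is linear in the unknown coefficients but nonlinear in the state, which is exactly the structure a single-layer ANN with state-dependent regressors can capture.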
Given the large nonlinearities present in the equations of motion (due to the Δ term), designing a controller that could provide good rolling handling qualities with such a wing-rock phenomenon can be considered an extremely challenging control problem.
So let’s see how an adaptive controller based on the MRAC architecture can perform with the Standard and Concurrent Learning laws.
Even though we will consider large initial errors in the ANN's weights, you will be amazed by the enhancements that Concurrent Learning can provide for this type of nonlinear system, in both performance and ANN weight convergence.
Standard vs. Concurrent Learning Model Reference Adaptive Control
In the following example, I will show you how the MIT + sigma robust modification learning law is not able to drive the ANN's weights to their optimal values when the reference input excitation stops.
On the contrary, you will see how the same adaptive controller, combined with the Concurrent Learning laws, is capable of making the ANN's weights converge to the optimal values.
In this exercise, we will assume that the plant's control power matrix B is known, while A and the coefficients of the wing-rock rolling moment will be unknown to the controller. Our reference model, the one representing the aircraft's desired dynamics, will be a second-order stable plant with critical damping ξ = 1.0 and a cut-off frequency ω of 3.0 rad/s.
For the sake of simplicity, we will consider the following inertia and aerodynamic characteristics for the plant (the aircraft), which yield this simplified nonlinear plant:
Now we will define the coefficients of the simplified destabilizing Wing-Rock rolling moment model as:
With this setup, we can finally simulate the plant dynamics using a simple forward Euler integration scheme to numerically solve the Ordinary Differential Equations (ODEs).
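The forward Euler scheme simply steps x(k+1) = x(k) + Δt·f(x(k), u(k)). Here is a minimal Python sketch of that integration loop (the post's own script is in MATLAB®; the full controller simulation also updates the ANN weights inside the loop, which is omitted here):

```python
import numpy as np

def euler_simulate(f, x0, u, t_end, dt):
    """Integrate ẋ = f(x, u(t)) with the forward Euler scheme.

    f  : function of (state, input) returning the state derivative
    x0 : initial state
    u  : function of time returning the control input
    """
    n_steps = int(t_end / dt)
    x = np.array(x0, dtype=float)
    trajectory = [x.copy()]
    for k in range(n_steps):
        x = x + dt * f(x, u(k * dt))   # x_{k+1} = x_k + dt * f(x_k, u_k)
        trajectory.append(x.copy())
    return np.array(trajectory)

# Usage: a stable linear plant ẋ = -x decays toward zero.
traj = euler_simulate(lambda x, u: -x, [1.0], lambda t: 0.0,
                      t_end=5.0, dt=0.01)
```

Forward Euler is the simplest choice and is fine for a demo with a small time step; a stiffer plant or larger step would call for a higher-order scheme.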
But don’t worry buddy, I have done the coding for you.
Hereafter you can find the MATLAB® script; you can use it to play around with the different plant parameter values and compare the performance of both controllers.
From this point, we will jump straight to the simulation results, so if you are still curious and want deeper insight into how the controller is designed, you can read this post.
If you run the previous code, you will see how the adaptive controller with Concurrent Learning laws (red line) has been able to match the dynamics of the reference system (black line), and, what is more relevant, its weights converged to the optimal values at instants when the reference input did not contain any activity at all!
On the contrary, the adaptive controller (blue line), whose weight-update strategy was based on the MIT + sigma robust modification laws, did not achieve good reference model tracking performance, and its weights got stuck at wrong values when the reference input signal (dashed black line) was not compliant with the PE requirement.
This is the magic of Concurrent Learning. Try it yourself: explore different plant dynamics, modify the initial state conditions, and play around with the script. No matter what you do, you will always end up seeing that Concurrent Learning is the best performer!
Have fun with the code!
Rodney Rodríguez Robles is an aerospace engineer, cyclist, blogger, and cutting edge technology advocate, living a dream in the aerospace industry he only dreamed of as a kid. He talks about coding, the history of aeronautics, rocket science, and all the technology that is making your day by day easier.