Modeling for Reinforcement Learning and Optimal Control: Double pendulum on a cart

Basic system outline

We briefly summarize the features/requirements that we would like to implement for the model:

  1. The system has a cart that can move linearly along the x-axis.
  2. The cart can be controlled by a limited force $f$ that acts along the x-axis.
  3. In the center of the cart a pole/pendulum is attached by an uncontrolled revolute joint.
  4. At the end of the first pole, a second pole is connected to the first pole by a second uncontrolled revolute joint.
  5. The system should behave in a physically realistic way, i.e. it is described by physical parameters such as mass and moment of inertia.
  6. We would like to be able to simulate the system from a given, or randomly chosen state $x_0$.
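
To make requirements 2 and 5 concrete, the physical parameters and the force limit could be collected in one place. The following is only a minimal sketch: the class name `CartDoublePendulumParams`, the helper `clip_force`, and all numeric values are illustrative assumptions, not values from the model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CartDoublePendulumParams:
    """Hypothetical parameter container; every default is a placeholder value."""
    m_cart: float = 1.0  # mass of the cart [kg]
    m1: float = 0.5      # mass of the first pole [kg]
    m2: float = 0.5      # mass of the second pole [kg]
    l1: float = 0.5      # length of the first pole [m]
    l2: float = 0.5      # length of the second pole [m]
    g: float = 9.81      # gravitational acceleration [m/s^2]
    f_max: float = 10.0  # bound on the magnitude of the control force [N]


def clip_force(f: float, params: CartDoublePendulumParams) -> float:
    """Saturate a commanded force to the interval [-f_max, f_max]."""
    return max(-params.f_max, min(params.f_max, f))


params = CartDoublePendulumParams()
print(clip_force(25.0, params))  # 10.0, the command is saturated at f_max
```

Applying the limit inside the model, rather than trusting the controller to respect it, keeps the simulated behavior realistic even when a learning agent proposes arbitrary forces.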

The state of the system

The model equations describe mathematically how the system evolves over time. First, however, we need to define the state of the system, which describes the situation the system is currently in. The state must have the Markov property: given only the current state (and the applied control input), we should be able to predict future states without knowing anything about the past. We need this property to be able to simulate the system.
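
For a mechanical system like this one, positions alone are not enough to satisfy the Markov property, because the same configuration can evolve differently depending on how fast it is moving. A common choice, assumed here for illustration, is to stack the three generalized coordinates (cart position and two pole angles) with their time derivatives. The names `STATE_NAMES` and `random_state` and the sampling ranges below are hypothetical.

```python
import numpy as np

# Assumed state layout: three positions followed by their three velocities.
STATE_NAMES = ("pos", "theta1", "theta2", "pos_dot", "theta1_dot", "theta2_dot")


def random_state(rng, angle_range=0.1, velocity_range=0.1):
    """Sample a state near the upright rest position (ranges are assumptions)."""
    q = np.zeros(3)
    q[1:] = rng.uniform(-angle_range, angle_range, size=2)    # pole angles [rad]
    q_dot = rng.uniform(-velocity_range, velocity_range, size=3)
    return np.concatenate([q, q_dot])


rng = np.random.default_rng(seed=0)
x0 = random_state(rng)
print(dict(zip(STATE_NAMES, x0)))
```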

The dynamical system equations

Now follows the hard part! We use the Euler-Lagrange equations [Wikipedia] to model the system dynamics. The Euler-Lagrange method is an energy-based method that is somewhat easier and requires less thinking than, for example, the (recursive) Newton-Euler method. You can apply this method quite programmatically to many types of systems.
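
As a reminder of the general recipe, stated here under the assumption of a frictionless system with generalized coordinates $q = (q_1, q_2, q_3)$ (cart position and the two pole angles; this naming is an assumption for illustration): write down the kinetic energy $T$ and the potential energy $V$, form the Lagrangian, and apply one equation per coordinate.

$$
L(q, \dot q) = T(q, \dot q) - V(q), \qquad
\frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = Q_i, \quad i = 1, 2, 3
$$

Here $Q_i$ is the generalized force acting on coordinate $q_i$. Because only the cart is actuated and both revolute joints are uncontrolled, $Q_1 = f$ and $Q_2 = Q_3 = 0$.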

Simulation code

And here are the Python and Matlab implementations to simulate the system starting from a random state $x_0$ for 8 seconds.
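
As an illustration of what such a simulation can look like, here is a minimal Python sketch under simplifying assumptions that may differ from the full model: the poles are massless rods with point masses at their ends, there is no friction, both angles are measured from the upright vertical, and all parameter values are placeholders. The equations in `dynamics` follow from applying the Euler-Lagrange method to this simplified system; integration uses a fixed-step fourth-order Runge-Kutta scheme, which is one possible choice among many.

```python
import numpy as np

# Illustrative parameters (assumptions, not taken from the article).
M0, M1, M2 = 1.0, 0.5, 0.5  # cart mass and the two point masses [kg]
L1, L2 = 0.5, 0.5           # pole lengths [m]
G = 9.81                    # gravitational acceleration [m/s^2]
F_MAX = 10.0                # bound on the control force [N]
A, B, C = (M1 + M2) * L1, M2 * L2, M2 * L1 * L2  # recurring constants


def dynamics(x, f):
    """dx/dt for x = (pos, th1, th2, pos_dot, th1_dot, th2_dot) and force f."""
    _, th1, th2, _, w1, w2 = x
    s1, c1, s2, c2 = np.sin(th1), np.cos(th1), np.sin(th2), np.cos(th2)
    s12, c12 = np.sin(th1 - th2), np.cos(th1 - th2)
    # Solve M(q) q_ddot = rhs, both obtained from the Euler-Lagrange equations.
    mass = np.array([[M0 + M1 + M2, A * c1, B * c2],
                     [A * c1, A * L1, C * c12],
                     [B * c2, C * c12, B * L2]])
    rhs = np.array([f + A * s1 * w1**2 + B * s2 * w2**2,
                    -C * s12 * w2**2 + A * G * s1,
                    C * s12 * w1**2 + B * G * s2])
    return np.concatenate([x[3:], np.linalg.solve(mass, rhs)])


def rk4_step(x, f, dt):
    """One classical Runge-Kutta step with the force held constant."""
    k1 = dynamics(x, f)
    k2 = dynamics(x + 0.5 * dt * k1, f)
    k3 = dynamics(x + 0.5 * dt * k2, f)
    k4 = dynamics(x + dt * k3, f)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def simulate(x0, duration=8.0, dt=0.002, policy=lambda t, x: 0.0):
    """Roll out the system; the default policy applies no force."""
    n_steps = int(round(duration / dt))
    traj = np.empty((n_steps + 1, 6))
    traj[0] = x0
    for k in range(n_steps):
        f = float(np.clip(policy(k * dt, traj[k]), -F_MAX, F_MAX))
        traj[k + 1] = rk4_step(traj[k], f, dt)
    return traj


def total_energy(x):
    """Kinetic plus potential energy, used only as a sanity check."""
    _, th1, th2, v, w1, w2 = x
    kinetic = (0.5 * (M0 + M1 + M2) * v**2 + 0.5 * A * L1 * w1**2
               + 0.5 * B * L2 * w2**2 + A * np.cos(th1) * v * w1
               + B * np.cos(th2) * v * w2 + C * np.cos(th1 - th2) * w1 * w2)
    return kinetic + A * G * np.cos(th1) + B * G * np.cos(th2)


rng = np.random.default_rng(seed=0)
x0 = rng.uniform(-0.05, 0.05, size=6)  # random state near the upright position
traj = simulate(x0)                    # 8 seconds of unforced motion
drift = max(abs(total_energy(x) - total_energy(x0)) for x in traj)
print(f"max energy drift over the rollout: {drift:.2e}")
```

With no applied force and no friction, the total energy should stay constant, so a small drift is a cheap check that the equations of motion and the integrator agree. To test a controller, pass a different `policy(t, x)` that returns the desired force.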






