Deep Reinforcement Learning in Mobile Robot Navigation Tutorial — Part 5: Some Extra Stuff

Reinis Cimurs
4 min read · Nov 6, 2022


In this series of articles, I aimed to go over the Python code in the repository, explain it, and give some insights so that it would be easier for others to use, change, refactor, repurpose, or disregard it. Mainly, I wished to go over the code implementation of the TD3 neural network and how it connects to the Gazebo simulator. I feel that the previous four parts should be sufficient for that. But there are also some extra bits and pieces of information that might help you use the repository better.

Increase the Speed of the Simulation

The way that the neural network training is conducted follows these steps:

Collect motion samples -> Train policy on samples -> Update policy

The training and updating speed will depend on the capabilities of your hardware, but generally it should not take more than a second. But what about the collection of samples in the simulation? Each step propagates the simulation by 0.1 seconds, and an episode can last a maximum of 500 steps (with the default parameters). That means a single episode can take 50 seconds of real time or more. It is quite evident that speeding up this part would bring the most benefit to the whole training process. The Gazebo simulator does not natively support running simulations faster than real time, but there is a way to ‘overclock’ it.

The simulation can be sped up by changing the <real_time_update_rate> value in your world file. In our repository, you will find the world file in the directory:

DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/launch

The (default) world file is:

TD3.world

Here you can find the <real_time_update_rate> tag in the physics section of the file:
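As a rough sketch (the exact values in the repository's TD3.world may differ slightly), the physics section looks something like this:

<physics type="ode">
  <!-- physics update steps per second -->
  <real_time_update_rate>1000.0</real_time_update_rate>
  <!-- simulated time advanced per update step, in seconds -->
  <max_step_size>0.001</max_step_size>
  <real_time_factor>1</real_time_factor>
</physics>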

The update rate is given as physics update steps per second. By default, the engine takes 1000 steps per second, each advancing the simulation by the <max_step_size> of 0.001 seconds, so 1 second of simulation time is executed in 1 second of real time. We can increase this number to execute the simulation faster. For instance, changing it to 2000 executes roughly 2 seconds of simulation time in 1 second of real time.

Note: Use this method at your own discretion. Setting too high an update rate often causes problems in the simulation, as sensors and other plugins simply cannot keep up with it. So try it out, and see which rates do not cause issues in your implementation.
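To check what speed-up you actually achieve, you can use the gz stats command-line tool, which prints the real-time factor of the currently running simulation. In a new terminal, execute:

gz stats

If the reported factor stays well below what your <real_time_update_rate> asks for, the hardware cannot keep up and you should lower the value.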

Starting the Gazebo Simulator Visualization

If the installation was carried out successfully, you should be able to launch the training and see an Rviz window pop up automatically. You will notice, however, that the Gazebo simulator GUI does not open. The Gazebo GUI uses up a lot of GPU resources and (depending on how powerful your machine is) may even crash the simulation. So, by default, the GUI is not launched. Rviz is usually enough to evaluate the progress of the training, but if you wish to launch the Gazebo GUI anyway, there are two ways to do it.

1.) Start the GUI during runtime:

Launch the training as usual. As explained above, this will only launch Rviz. Then open a new terminal and execute the command:

gzclient

This will open the GUI of the currently running Gazebo simulation.

2.) Launch the GUI when starting the training

You can also start the Gazebo GUI automatically when launching the training by changing the default behavior. Open the file empty_world.launch in the directory:

DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/launch

and change the gui argument. Since the GUI is disabled by default, the line presumably reads:
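<arg name="gui" default="false"/> <!-- GUI disabled by default -->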

to:
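<arg name="gui" default="true"/> <!-- launches gzclient together with the training -->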

Changing Velodyne Sensor Settings

As you will have noticed by now, we use a simulated Velodyne Puck 16-channel LiDAR to record the environment around the robot. Our basic implementation assumes that the laser sensor has only a 180-degree FOV in front of the robot. That means the robot can only move forward, as it does not see anything behind itself. If a view behind the robot is necessary, the default Velodyne sensor settings need to be changed. You will find the configuration file of the Velodyne Puck here:

DRL-robot-navigation/catkin_ws/src/velodyne_simulator/velodyne_description/urdf

The changeable parameters can be found at the top of the VLP-16.urdf.xacro file:
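The macro definition and its default parameters sit at the top of that file. Roughly, it looks like this (parameter names and default values may differ between versions of the velodyne_simulator package):

<!-- sketch of the macro header; check your copy of the file for the exact defaults -->
<xacro:macro name="VLP-16" params="*origin parent:=base_link name:=velodyne
    topic:=/velodyne_points hz:=10 lasers:=16 samples:=1875
    min_range:=0.9 max_range:=130.0 noise:=0.008
    min_angle:=-${M_PI} max_angle:=${M_PI} gpu:=false">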

We set these parameters when we call the plugin in the robot setup. The robot plugin setup is available in:

DRL-robot-navigation/catkin_ws/src/multi_robot_scenario/xacro/p3dx

in file:

pioneer3dx.xacro

The parameters are set in the Velodyne section:
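As a sketch of what the macro call might look like (the exact numbers in the repository may differ, and the origin values here are placeholders), a 180-degree front-facing FOV corresponds to min_angle and max_angle spanning -π/2 to π/2:

<xacro:VLP-16 parent="base_link" name="velodyne" topic="/velodyne_points"
    hz="10" samples="360" gpu="false"
    min_angle="-1.5708" max_angle="1.5708">
  <!-- placeholder mounting position: 0.2 m above the robot origin -->
  <origin xyz="0 0 0.2" rpy="0 0 0" />
</xacro:VLP-16>

To also see behind the robot, widen this range, up to min_angle="-3.1416" and max_angle="3.1416" for a full 360-degree view.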

Here, we can set the number of samples we want to receive, the frequency of observations, and the FOV (among other things). The FOV is given as a range from a minimum to a maximum angle relative to the heading of the robot (as long as you do not change the “rpy” values in origin); with the standard ROS convention, negative angles sweep to the robot's right and positive angles to its left. These values are set in radians.

Another thing to note is that we can set the origin. This will specify the location of the sensor on the robot and is described in meters with respect to the robot's origin.


Reinis Cimurs

Research scientist interested in machine learning, robotics, and autonomous driving