Journey from academic paper to industry usage

What implementing Conditional Imitation Learning taught us about reproducing a paper

Spoiler: Our trained autonomous vehicle stack after successful training

Motivation: Reproducibility of publications

There are several reasons why you might want to reproduce a published method:

  • Reproduce the paper as a baseline for comparison.
  • Verify your understanding of the paper.
  • Use the research method as a submodule in your own stack (e.g. an OCR model in a document analyzer, or a lane detection model in an autonomous driving system).
  • Check how the algorithm extends to different datasets (often a proprietary dataset you have access to).
  • Extend the model for improved results (e.g. trying a different loss function, or tweaking the network architecture).
End-to-end conditional imitation learning (CIL) is a simple way to learn an autonomous driving model, and it features a rich network architecture: the main part of the network is mostly convolutional and is connected to one of several command-specific branches (a minimal code sketch follows the list below).
The network has multiple branches, but only one is active at a time: the image shows an example where the high-level command indicates to go left, so only the GO LEFT branch is activated.
  • The observations are the perceptual input (RGB images) from a windshield camera.
  • The high-level commands are one of the following: “follow the lane,” “drive straight at the next intersection,” “turn left at the next intersection,” and “turn right at the next intersection.”
  • The actions are the regression targets (steer, throttle, brake).
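To make the branching concrete, here is a minimal PyTorch sketch of such an architecture. This is not the authors' implementation: the layer sizes, command encoding, and all names are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical command ordering, for illustration only.
COMMANDS = ["follow lane", "straight", "turn left", "turn right"]


class BranchedDrivingNet(nn.Module):
    """Toy CIL-style network: a convolutional trunk plus one head per high-level command."""

    def __init__(self, num_commands: int = len(COMMANDS), num_targets: int = 3):
        super().__init__()
        # perception trunk: mostly convolutional, as described above
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # one small regression head per high-level command
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, num_targets))
            for _ in range(num_commands)
        ])

    def forward(self, image: torch.Tensor, command_idx: torch.Tensor) -> torch.Tensor:
        features = self.trunk(image)  # (batch, 128)
        # evaluate all branches, then keep only the one selected by the command
        all_branches = torch.stack([b(features) for b in self.branches], dim=1)
        idx = command_idx.view(-1, 1, 1).expand(-1, 1, all_branches.size(-1))
        return all_branches.gather(1, idx).squeeze(1)  # (batch, 3): steer, throttle, brake


# usage example with placeholder inputs
net = BranchedDrivingNet()
images = torch.randn(4, 3, 88, 200)    # camera frames (random placeholders)
commands = torch.tensor([0, 2, 2, 3])  # e.g. 2 = "turn left"
actions = net(images, commands)        # shape (4, 3)
```

Because only the selected branch contributes to the output, gradients from the loss flow only through the branch that matches the high-level command of each sample.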

Enable batch normalization

Speed up the training cycle

  1. We make an adjustment to our training procedure, e.g. enabling dropout.
  2. To see how well this adjustment works, we have to fully train the model; the entire training takes 27 hours on a Tesla P100 GPU.
  3. We run the trained model in simulation and see how it deals with different situations.
The paper is accompanied by input data. The distribution of this training data is shown as a histogram for each of the regression targets. Left: steer, Middle: throttle, Right: brake.
A trained baseline model was published together with the paper. We ran this model and looked at its output distribution. Left: steer, Middle: throttle, Right: brake.
We successfully trained our model for 30,000 training steps. This shows the histogram of targets: Left: steer, Middle: throttle, Right: brake.
We unsuccessfully trained our model for 30,000 training steps. This shows the histogram of regression targets: Left: steer, Middle: throttle, Right: brake. Training is going to fail because we are already seeing large absolute steer values. (The steer values should mostly fall between -1 and 1, as seen in the input data distribution.)
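Plotting these histograms is straightforward once the targets are extracted. Here is a minimal sketch, assuming the steer/throttle/brake targets have already been dumped into an (N, 3) NumPy array; the file name is a placeholder.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical: an (N, 3) array of [steer, throttle, brake] values extracted
# from the training set (or from a model's predictions).
targets = np.load("regression_targets.npy")

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
for ax, values, name in zip(axes, targets.T, ["steer", "throttle", "brake"]):
    ax.hist(values, bins=50)       # one histogram per regression target
    ax.set_title(name)
    ax.set_xlabel("target value")
    ax.set_ylabel("count")
fig.tight_layout()
plt.show()
```

Comparing the histogram of a model's outputs against the histogram of the training targets is a cheap sanity check that can flag a failing run long before the full 27 hours of training are over.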

Tuning hyperparameters

Prior to tuning hyperparameters, we did not adjust the weight for steering; as a result, the vehicle drifts off the street.
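One of the knobs involved here is the relative weight of the steering target in the loss. The following is only a sketch of such a per-target weighting; the weight values are illustrative and not the ones from the paper or our final configuration.

```python
import torch
import torch.nn.functional as F

# Illustrative per-target weights for [steer, throttle, brake].
# Weighting steer more strongly penalizes steering errors harder.
TARGET_WEIGHTS = torch.tensor([0.7, 0.2, 0.1])


def weighted_regression_loss(prediction: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # mean squared error computed separately per target, then combined with the weights
    per_target = F.mse_loss(prediction, target, reduction="none").mean(dim=0)
    return (TARGET_WEIGHTS * per_target).sum()
```

Increasing the steer weight makes steering errors dominate the loss relative to throttle and brake, which is one way to counteract the drifting behavior shown above.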

Reach out to the authors

Know your data

Sensitivity to initialization

Conclusion

  • Know your data: You can do this by plotting the distributions of the training dataset. You also want to look at some samples (in the case of CIL, training sequences).
  • Speed up the training cycle: In order to iterate quickly, figure out how you can reduce the training time. Are there any conclusions you can draw before the training has completely finished? Is your training setup easy to work with? For example, how easy is it to trigger a new training run? Ideally, triggering a training run takes one command line call from your development machine.
  • Pay attention to the details in the paper. If you can’t find something in the paper or the related work cited, try to contact the authors (e.g., via GitHub issues, or email).
  • You might need workarounds to adjust the paper to your own use case. The paper is the result of an academic setup, so many things are not designed for a plug-and-play use in industry (see our related post).
  • We discussed several measures that are very specific to the CIL paper: enabling batch normalization, tuning hyperparameters, and dealing with the sensitivity to initialization. For the paper you are reproducing, the challenges will most likely be of a different nature.
We would like to thank:

  • John for writing the initial prototype of the training code
  • Robert, Filippo, Clemens, Mark, and Rasmus for their valuable feedback on the code and this article
  • the CARLA team for providing a practical environment to prototype autonomous driving

References

  1. Codevilla, F., Müller, M., López, A., Koltun, V. and Dosovitskiy, A., 2018, May. End-to-end driving via conditional imitation learning. In IEEE International Conference on Robotics and Automation (ICRA). IEEE.
  2. Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J. and Zhang, X., 2016. End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
  3. Cabi, S., Colmenarejo, S.G., Hoffman, M.W., Denil, M., Wang, Z. and De Freitas, N., 2017. The intentional unintentional agent: Learning to solve many continuous control tasks simultaneously. arXiv preprint arXiv:1707.03300.
  4. Cheng, M.Y., Gupta, A., Ong, Y.S. and Ni, Z.W., 2017. Coevolutionary multitasking for concurrent global optimization: With case studies in complex engineering design. Engineering Applications of Artificial Intelligence, 64, pp.13–24.
