Behavioral Cloning Project
Project 3 must be submitted by each student of Udacity’s Self-Driving Car Nanodegree Program in order to complete the task of teaching a computer to drive a car. The video below shows my result, where a car (in the simulator) drives by itself. The window with green text on the left side shows the steering wheel angle being updated in real time based on road conditions.
After learning the core machine learning concepts behind Project 2, I moved on to a framework called Keras. Keras is a high-level framework that uses TensorFlow underneath to do machine learning. Before Keras, I had to code everything from the ground up in order to train a model; with Keras, everything becomes simpler, since all I need to do is use the library it provides. The resulting code is also shorter, yet more efficient (at least for a newbie like me).
So what is behavioral cloning about? If you have watched The Matrix, there is a scene where Neo gets “uploaded” with martial arts data so he can learn and master kung fu skills.
A quite similar method is applied in behavioral cloning to train a model for a car to drive autonomously. A trainer (in this case a person) must drive the car while his or her driving information is recorded. The recorded data contains what the computer sees on the road (computer vision) and the steering wheel angle for the car’s position. For example: when the car drifts toward the edge of the road, the person driving will “naturally” turn the wheel in the opposite direction to keep the car in the middle. When the car approaches a left or right turn, the steering wheel follows. The simulator records all of this data (road images and wheel positions) to be used as training data.
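The simulator writes this recording out as a `driving_log.csv` that pairs camera frames with the steering angle at that moment. A minimal sketch of reading it could look like the following; the `correction` offset for the side cameras is a common trick for this project, and its value here is an assumption, not something fixed by the simulator:

```python
import csv

def load_driving_log(path="driving_log.csv", correction=0.2):
    """Read the simulator's log: each row holds center/left/right image
    paths followed by the recorded steering angle. The side cameras get
    a steering correction so they can serve as extra recovery data
    (0.2 is an assumed, tunable offset)."""
    samples = []
    with open(path) as f:
        for row in csv.reader(f):
            center_img, left_img, right_img = row[0], row[1], row[2]
            steering = float(row[3])
            samples.append((center_img, steering))
            # left camera sees the car as if it were further left,
            # so steer a bit more to the right, and vice versa
            samples.append((left_img, steering + correction))
            samples.append((right_img, steering - correction))
    return samples
```

Each recorded frame thus yields three training samples, which helps the model learn how to recover toward the center of the road.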
Several network architectures are introduced in this lesson by Udacity: VGG, AlexNet, GoogLeNet, and NVIDIA’s. These architectures were basically developed from LeNet. I used LeNet on my first attempt to train the model, but the result was not accurate, so I tried the NVIDIA network instead. The result was pretty good.
Here’s how I used Keras to model the network architecture above.
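As a sketch, a Keras definition of the NVIDIA-style network could look like this; the normalization lambda, the cropping margins, and the exact layer sizes are assumptions based on the setup commonly used for this project, not necessarily my exact code:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, Lambda, Cropping2D,
                                     Conv2D, Flatten, Dense)

def build_nvidia_model(input_shape=(160, 320, 3)):
    """NVIDIA-style end-to-end steering network (a sketch; the
    cropping margins and layer sizes are assumed values)."""
    model = Sequential([
        Input(shape=input_shape),
        # normalize pixels to roughly [-0.5, 0.5]
        Lambda(lambda x: x / 255.0 - 0.5),
        # crop away the sky (top 70 rows) and the car hood (bottom 25)
        Cropping2D(cropping=((70, 25), (0, 0))),
        Conv2D(24, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(48, (5, 5), strides=(2, 2), activation="relu"),
        Conv2D(64, (3, 3), activation="relu"),
        Conv2D(64, (3, 3), activation="relu"),
        Flatten(),
        Dense(100, activation="relu"),
        Dense(50, activation="relu"),
        Dense(10, activation="relu"),
        Dense(1),  # single output: the steering angle
    ])
    # steering prediction is a regression, so mean squared error
    model.compile(loss="mse", optimizer="adam")
    return model
```

Note that the network ends with a single linear output rather than a softmax, since predicting a steering angle is a regression problem, not classification.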
As you can see, with Keras, defining the NVIDIA network architecture is very simple. Keras handles all the required work underneath, including maximizing GPU use to speed up the training process, so we can focus on the logic of our application or research.
My full project is available in this GitHub repository, along with the review from Udacity. As always, their review is valuable, with more details about how to improve the project result.
Additionally, here’s a video showing the driving experience from inside the car.
Good, but Not Best
On every project, Udacity offers a challenge option to produce a better result. For behavioral cloning, a second track is available that is more difficult than the first. My model is still not good enough for the challenge. As you can see in the next video, the track is more difficult (a mountain area), and my model still fails to drive autonomously on it.
Here’s a video sample of me driving manually on the second track. It is quite difficult.
I tried my best to drive 2 laps around the mountain without a single failure. But training still produced a model that is not good enough to drive autonomously there. As a result, it fails right from the beginning, as seen in the next video.
I will try to find some time in the future to improve the model.
Within this module, Udacity also teaches transfer learning. Training a model can take hours to complete even on powerful hardware, and not everyone has the same computing power to reach the same result. A trained model can instead be transferred to another training task using several variants (feature extraction, fine tuning, or training from scratch) to solve various other problem sets.
Real World Use Case
The fact that traffic accident statistics are pretty high here in Qatar would be a good reason to have self-driving cars in the future, to decrease fatalities on the road due to human error. But self-driving cars aren’t the only machine learning application that can improve the quality of human lives. Stanford’s Vision Lab, for example, focuses on two intimately connected branches of vision research: computer vision and human vision. In both fields, they are intrigued by visual functionalities that give rise to semantically meaningful interpretations of the visual world.
That means that one day, through vision and intelligence, a computer could understand many contexts, including the ability to understand humans. For example, a computer could be used in a school classroom to gauge students’ understanding of the given material based on their facial expressions during a class session.
Another example could be in the healthcare field. A friend of mine who had surgery earlier in Doha told me that the operating room in the state-owned hospital looked like an airplane cockpit: it’s all computerized. Computer vision and AI can help doctors handle difficult operations in the future.
While machine learning seems intriguing at the beginning, once you start understanding the concepts, the logic, and the coding experience, ML becomes much easier, even for people without much background or experience in the field.
I tend to agree with what Jeremy Howard wrote on his MOOC website:
Making neural nets uncool again.
No, it’s not that difficult to master the skill. I believe that if more people understand how to use machine learning techniques, especially with the availability of high-level frameworks such as Keras, more problem sets can be solved together.
Of course, we should be very careful so the world pictured by The Matrix doesn’t happen in the future. Because maybe, just maybe… one day, computers or machines can become smarter than humans, and they will, take over, the world…! (dum… dum…, *play horror music here*)