Udacity Self-Driving Car Project 4

Kevin Chiu
Published in CodingJourney
4 min read · Jul 1, 2020

Project 4: Behavior Cloning

GitHub: Project 4_Behavioral-Cloning

This post covers:

1. Using Keras

2. Transfer Learning

3. CNNs History and introduction

4. Behavior Cloning project

— — — — — — — — — — — — — — — — — —

1. Using Keras

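In the previous project (Project 3) I built a CNN with TensorFlow. This lesson introduces Keras as a more efficient way to build one.

Comparing the earlier TensorFlow code with this Keras version, Keras turns out to be much more convenient.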
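As a rough illustration of that convenience, here is a minimal sketch of a small LeNet-style CNN in Keras. The function name `build_small_cnn`, the 32x32x3 input shape, the 43 classes, and the layer sizes are all assumptions chosen for illustration; this is not the network from the original post.

```python
from keras import layers, models


def build_small_cnn(input_shape=(32, 32, 3), num_classes=43):
    """LeNet-style CNN. Input shape and class count are illustrative assumptions."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(6, (5, 5), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(16, (5, 5), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(120, activation="relu"),
        layers.Dense(84, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # One call wires up the optimizer, the loss, and the metric.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Each layer is one line, and training would then be a single `model.fit(...)` call, with no manual session or variable handling.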

2. Transfer Learning

Case 1: Small Data Set, Similar Data

If the new data set is small and similar to the original training data:

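In short: keep the weights of the earlier convolutional layers, change the number of outputs of the fully connected layer at the end, and train only those new weights.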

  • slice off the end of the neural network
  • add a new fully connected layer that matches the number of classes in the new data set
  • randomize the weights of the new fully connected layer; freeze all the weights from the pre-trained network
  • train the network to update the weights of the new fully connected layer
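The steps for Case 1 can be sketched in Keras roughly as follows. The helper name `build_case1_model`, the choice of VGG16 as the pre-trained network, and the global-average-pooling layer in front of the new classifier are assumptions for illustration. Note that `include_top=False` drops VGG16's whole original classifier head, not just its last layer.

```python
from keras import layers, models
from keras.applications.vgg16 import VGG16


def build_case1_model(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Small + similar data: frozen pre-trained base, new trainable classifier."""
    # Slice off the end of the network by leaving out the original classifier.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    # Freeze all the weights from the pre-trained network.
    for layer in base.layers:
        layer.trainable = False
    # New, randomly initialized fully connected layer sized for the new classes.
    x = layers.GlobalAveragePooling2D()(base.output)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs=base.input, outputs=outputs)
```

With the base frozen, training only updates the kernel and bias of the new `Dense` layer.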

Case 2: Small Data Set, Different Data

If the new data set is small and different from the original training data:

  • slice off all but some of the pre-trained layers near the beginning of the network (keep only the early, low-level feature layers)
  • add to the remaining pre-trained layers a new fully connected layer that matches the number of classes in the new data set
  • randomize the weights of the new fully connected layer; freeze all the weights from the pre-trained network
  • train the network to update the weights of the new fully connected layer
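A minimal Keras sketch of Case 2, assuming VGG16 as the pre-trained network. The helper name `build_case2_model` and the cut point `block2_pool` (a layer name from the Keras VGG16 implementation) are illustrative choices; where to cut is a judgment call that depends on the data.

```python
from keras import layers, models
from keras.applications.vgg16 import VGG16


def build_case2_model(num_classes, cut_layer="block2_pool",
                      input_shape=(224, 224, 3), weights="imagenet"):
    """Small + different data: keep only early frozen layers, add a new classifier."""
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    # Freeze all the weights from the pre-trained network.
    for layer in base.layers:
        layer.trainable = False
    # Keep only the layers up to cut_layer; everything after it is dropped.
    x = base.get_layer(cut_layer).output
    x = layers.GlobalAveragePooling2D()(x)
    # New fully connected layer matching the number of classes in the new data set.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return models.Model(inputs=base.input, outputs=outputs)
```

The early layers mostly capture generic edges and textures, which is why they are the part worth keeping when the new data looks different from the original training data.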

Case 3: Large Data Set, Similar Data

If the new data set is large and similar to the original training data:

  • remove the last fully connected layer and replace with a layer matching the number of classes in the new data set
  • randomly initialize the weights in the new fully connected layer
  • initialize the rest of the weights using the pre-trained weights
  • re-train the entire neural network
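Case 3 could look roughly like this in Keras. The helper name `build_case3_model`, the VGG16 base, and the small learning rate of 1e-4 are assumptions; a low learning rate is a common choice when fine-tuning so that the pre-trained weights are not overwritten too aggressively. As a simplification, `include_top=False` replaces the whole original classifier head rather than only the last layer.

```python
from keras import layers, models, optimizers
from keras.applications.vgg16 import VGG16


def build_case3_model(num_classes, input_shape=(224, 224, 3), weights="imagenet"):
    """Large + similar data: start from pre-trained weights, fine-tune everything."""
    # The rest of the weights are initialized from the pre-trained network.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    x = layers.GlobalAveragePooling2D()(base.output)
    # Replacement layer with randomly initialized weights.
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    model = models.Model(inputs=base.input, outputs=outputs)
    # Nothing is frozen, so training updates the entire network.
    model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```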

Case 4: Large Data Set, Different Data

If the new data set is large and different from the original training data:

  • remove the last fully connected layer and replace with a layer matching the number of classes in the new data set
  • retrain the network from scratch with randomly initialized weights
  • alternatively, you could just use the same strategy as the “large and similar” data case
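The four cases boil down to a two-by-two lookup on data set size and similarity. Here is a small sketch of that table; the function name `choose_transfer_strategy` and the wording of the returned strings are my own summary, not an official API.

```python
def choose_transfer_strategy(large_dataset: bool, similar_data: bool) -> str:
    """Map (data set size, similarity to original data) to one of the four cases."""
    strategies = {
        (False, True): "Case 1: freeze the pre-trained layers, train a new final layer",
        (False, False): "Case 2: keep only early pre-trained layers (frozen), train a new final layer",
        (True, True): "Case 3: start from pre-trained weights, replace the last layer, fine-tune everything",
        (True, False): "Case 4: replace the last layer and retrain from scratch (or fine-tune as in Case 3)",
    }
    return strategies[(large_dataset, similar_data)]


print(choose_transfer_strategy(large_dataset=False, similar_data=True))
```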

3. CNN introduction

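The lesson first introduces ImageNet (a huge labeled image dataset) and what it is used for.

You can even load well-known image classification models pre-trained on ImageNet directly through Keras Applications, for example ResNet50, VGG16, InceptionV3…

AlexNet: an architecture similar to LeNet.

VGG16/19 (the number is the layer count): the architecture is built from stacks of 3x3 convolutions and pooling layers.

VGG16 can be imported directly from Keras: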

from keras.applications.vgg16 import VGG16

model = VGG16(weights='imagenet', include_top=False)

GoogLeNet/Inception: introduces the concept of the inception module.

GoogLeNet's main advantage is that it runs fast, which makes it suitable for self-driving cars.

from keras.applications.inception_v3 import InceptionV3

model = InceptionV3(weights='imagenet', include_top=False)
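To make the inception module idea concrete, here is a minimal sketch of one GoogLeNet-style module with the Keras functional API: several branches with different kernel sizes run in parallel on the same input and their outputs are concatenated along the channel axis. The helper name `inception_module` is my own, and the filter counts and 28x28x192 input are example values. InceptionV3 itself uses more elaborate, factorized variants of this module.

```python
from keras import layers, models


def inception_module(x, f1, f3_reduce, f3, f5_reduce, f5, pool_proj):
    """Parallel 1x1, 3x3, 5x5, and pooling branches, concatenated on channels."""
    # Branch 1: plain 1x1 convolution.
    b1 = layers.Conv2D(f1, (1, 1), padding="same", activation="relu")(x)
    # Branch 2: 1x1 reduction followed by a 3x3 convolution.
    b3 = layers.Conv2D(f3_reduce, (1, 1), padding="same", activation="relu")(x)
    b3 = layers.Conv2D(f3, (3, 3), padding="same", activation="relu")(b3)
    # Branch 3: 1x1 reduction followed by a 5x5 convolution.
    b5 = layers.Conv2D(f5_reduce, (1, 1), padding="same", activation="relu")(x)
    b5 = layers.Conv2D(f5, (5, 5), padding="same", activation="relu")(b5)
    # Branch 4: 3x3 max pooling followed by a 1x1 projection.
    bp = layers.MaxPooling2D((3, 3), strides=(1, 1), padding="same")(x)
    bp = layers.Conv2D(pool_proj, (1, 1), padding="same", activation="relu")(bp)
    return layers.Concatenate(axis=-1)([b1, b3, b5, bp])


inputs = layers.Input(shape=(28, 28, 192))
outputs = inception_module(inputs, 64, 96, 128, 16, 32, 32)
model = models.Model(inputs, outputs)
```

The 1x1 "reduce" convolutions shrink the channel count before the expensive 3x3 and 5x5 convolutions, which is a large part of why the network stays fast.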

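ResNet: achieves a lower error rate by going much deeper (152 layers).

In general, though, simply adding layers makes a network degrade (the degradation problem): as depth increases, accuracy saturates and can even drop.

So how does ResNet solve this? Answer: residual learning.

ResNet takes the VGG19 network as a reference, modifies it, and adds residual units through shortcut connections.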
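A residual unit computes F(x) + x: the stacked layers only have to learn the residual F(x), while the shortcut passes the input x through unchanged. Below is a minimal Keras sketch of a basic two-convolution unit. The helper name `residual_block` and the 32x32x64 input are assumptions, and ResNet50 actually uses a three-layer "bottleneck" variant of this idea.

```python
from keras import layers, models


def residual_block(x, filters):
    """Basic residual unit. Assumes x already has `filters` channels."""
    shortcut = x  # identity shortcut connection
    y = layers.Conv2D(filters, (3, 3), padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, (3, 3), padding="same")(y)
    y = layers.BatchNormalization()(y)
    # Add the untouched input back onto the learned residual F(x).
    y = layers.Add()([shortcut, y])
    return layers.Activation("relu")(y)


inputs = layers.Input(shape=(32, 32, 64))
outputs = residual_block(inputs, 64)
model = models.Model(inputs, outputs)
```

If the extra layers are not useful, they can push F(x) toward zero and the block falls back to the identity, which is the intuition for why very deep stacks of these units are easier to optimize than plain stacks of layers.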

For details, see: ResNet intro

from keras.applications.resnet50 import ResNet50

model = ResNet50(weights='imagenet', include_top=False)

4. Behavior Cloning project

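Finally, on to the project itself!

The goal this time is for the car to learn to drive on the road by itself without crossing the lane edges.

This mostly comes down to collecting a lot of image data in which the car stays as close to the center of the lane as possible. The Udacity project guide covers all of this, but actually doing it still took some time.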
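The recorded data ends up as pairs of camera images and steering angles. Here is a minimal sketch of reading those pairs from a driving log. The column order (center, left, right, steering, throttle, brake, speed) is an assumption about the simulator's CSV log, and the function name, file paths, and values are made up for illustration.

```python
import csv
import io


def load_center_samples(log_file):
    """Return (center_image_path, steering_angle) pairs from a driving log.

    Assumes rows of: center, left, right, steering, throttle, brake, speed.
    """
    samples = []
    for row in csv.reader(log_file):
        try:
            steering = float(row[3])
        except (ValueError, IndexError):
            continue  # skip a header line or a malformed row
        samples.append((row[0].strip(), steering))
    return samples


# Tiny in-memory stand-in for a real log file (hypothetical paths and values).
demo_log = io.StringIO(
    "center,left,right,steering,throttle,brake,speed\n"
    "IMG/center_001.jpg, IMG/left_001.jpg, IMG/right_001.jpg, 0.0, 0.9, 0, 30.1\n"
    "IMG/center_002.jpg, IMG/left_002.jpg, IMG/right_002.jpg, -0.25, 0.9, 0, 30.0\n"
)
samples = load_center_samples(demo_log)
```

With a real recording, the same function would be called on an opened log file, and the image paths would then be loaded as the network's inputs with the steering angles as targets.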

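In this project, building the CNN was actually not the hard part. The tedious part was collecting the data myself and testing over and over in the simulator. During training and testing, the car kept driving off the track at the turn around the 46-second mark of the video, so I recorded a lot of extra images and steering-angle data specifically at that spot, and eventually trained a model that completes the whole track.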
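For reference, a steering model for this kind of task can be quite small. The sketch below is not the network from this project: the function name `build_steering_model`, the 160x320x3 frame size, the crop margins, and the layer sizes are all assumptions. It shows the general shape of the problem, a CNN that regresses a single steering angle from one camera frame.

```python
from keras import layers, models


def build_steering_model(input_shape=(160, 320, 3)):
    """Regress one steering angle from a camera frame (illustrative sizes only)."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Scale pixel values from [0, 255] to roughly [-0.5, 0.5].
        layers.Rescaling(scale=1.0 / 255.0, offset=-0.5),
        # Crop 50 rows off the top and 20 off the bottom of each frame.
        layers.Cropping2D(cropping=((50, 20), (0, 0))),
        layers.Conv2D(24, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(36, (5, 5), strides=(2, 2), activation="relu"),
        layers.Conv2D(48, (5, 5), strides=(2, 2), activation="relu"),
        layers.Flatten(),
        layers.Dense(50, activation="relu"),
        # Single linear output: the predicted steering angle.
        layers.Dense(1),
    ])
    # Regression, so mean squared error instead of a classification loss.
    model.compile(optimizer="adam", loss="mse")
    return model
```

Because the output is a continuous angle, there is no softmax and no accuracy metric; the loss itself is what gets tracked during training.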

— — Additional Resources on Deep Learning — —

Behavioral Cloning

The below paper shows one of the techniques Waymo has researched using imitation learning (aka behavioral cloning) to drive a car.

ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst

Object Detection and Tracking

The below papers include various deep learning-based approaches to 2D and 3D object detection and tracking.

SSD: Single Shot MultiBox Detector

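1. SSD combines the regression idea from YOLO with the anchor mechanism from Faster R-CNN, regressing on multi-scale region features at every position of the image. This keeps YOLO's speed while making the box predictions about as accurate as Faster R-CNN's.

2. The core of SSD is applying convolution kernels to feature maps to predict the classes and coordinate offsets of a set of default bounding boxes. To improve detection accuracy, SSD makes predictions on feature maps at several different scales.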
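To illustrate what default bounding boxes on feature maps of different scales look like, here is a small pure-Python sketch. Each feature-map cell gets one box per aspect ratio, centered on the cell, with width `scale * sqrt(ratio)` and height `scale / sqrt(ratio)`. The function name, the feature-map sizes, and the scale values are made up for illustration, and the extra box the paper adds for aspect ratio 1 is left out.

```python
import math


def default_boxes(feature_map_size, scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (cx, cy, w, h) default boxes in normalized [0, 1] image coordinates."""
    boxes = []
    for row in range(feature_map_size):
        for col in range(feature_map_size):
            # Box center sits in the middle of the feature-map cell.
            cx = (col + 0.5) / feature_map_size
            cy = (row + 0.5) / feature_map_size
            for ratio in aspect_ratios:
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes


# A coarse map with large boxes and a finer map with small boxes.
coarse = default_boxes(feature_map_size=2, scale=0.6)
fine = default_boxes(feature_map_size=4, scale=0.3)
```

The network then predicts, for every one of these boxes, class scores plus offsets that nudge the box onto the actual object; the coarse map handles large objects and the fine map handles small ones.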

VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

VoxelNet takes a LiDAR point cloud as input and predicts 3D bounding boxes (3D object detection).

Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net

Semantic Segmentation

The below paper concerns a technique called semantic segmentation, where each pixel of an image gets classified individually!

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
