Udacity Self-Driving Car Project 4
Project 4: Behavior Cloning
Github: Project 4_Behavioral-Cloning
This part of the course roughly covers:
1. Using Keras
2. Transfer Learning
3. CNNs History and introduction
4. Behavior Cloning project
— — — — — — — — — — — — — — — — — —
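1. Using Keras
The previous project (Project 3) built a CNN directly in TensorFlow; this lesson introduces Keras as a more efficient way to build one.
Comparing the earlier TensorFlow code with this Keras version, Keras turns out to be far more convenient.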
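As a rough illustration of that convenience, a small LeNet-style CNN might be declared in Keras as below. This is a minimal sketch, not the code from the course: the helper name `build_small_cnn`, the layer sizes, the 32x32x3 input shape, and the 43 output classes are all assumptions chosen for illustration.

```python
import keras
from keras import layers


def build_small_cnn(input_shape=(32, 32, 3), num_classes=43):
    """Build a small LeNet-style classifier (illustrative sizes only)."""
    model = keras.Sequential([
        keras.Input(shape=input_shape),
        layers.Conv2D(6, (5, 5), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(16, (5, 5), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(120, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # Loss, optimizer, and metrics are configured in a single call.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_small_cnn()
```

Each layer is one line, and shapes between layers are inferred automatically, which is most of what makes the Keras version shorter than hand-written TensorFlow graph code.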
2. Transfer Learning
Case 1: Small Data Set, Similar Data
If the new data set is small and similar to the original training data:
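In short: keep the weights of the earlier convolutional layers, change the output size of the final fully connected layer, and train only that layer's weights.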
- slice off the end of the neural network
- add a new fully connected layer that matches the number of classes in the new data set
- randomize the weights of the new fully connected layer; freeze all the weights from the pre-trained network
- train the network to update the weights of the new fully connected layer
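The four steps of Case 1 could be sketched in Keras roughly as follows. This is an illustrative sketch rather than the course's code: the helper name `build_case1_model`, the 64x64 input size, and the global-average-pooling layer before the new classifier are assumptions. Passing `weights=None` skips the ImageNet download, which is only useful for a quick shape check.

```python
import keras
from keras import layers
from keras.applications.vgg16 import VGG16


def build_case1_model(num_classes, weights="imagenet", input_shape=(64, 64, 3)):
    """Frozen pre-trained base plus one new trainable classifier layer."""
    # Slice off the end: load the network without its original classifier.
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    # Freeze all the weights from the pre-trained network.
    base.trainable = False

    inputs = keras.Input(shape=input_shape)
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)  # assumption: pool before the classifier
    # New fully connected layer, randomly initialized, sized to the new classes.
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = keras.Model(inputs, outputs)
    # Training this model now updates only the new layer.
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model
```

Calling `build_case1_model(num_classes=10)` and then `fit` on the new data would train only the final `Dense` layer, since it is the only layer left trainable.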
Case 2: Small Data Set, Different Data
If the new data set is small and different from the original training data:
- slice off most of the pre-trained layers, keeping only those near the beginning of the network
- add to the remaining pre-trained layers a new fully connected layer that matches the number of classes in the new data set
- randomize the weights of the new fully connected layer; freeze all the weights from the pre-trained network
- train the network to update the weights of the new fully connected layer
Case 3: Large Data Set, Similar Data
If the new data set is large and similar to the original training data:
- remove the last fully connected layer and replace with a layer matching the number of classes in the new data set
- randomly initialize the weights in the new fully connected layer
- initialize the rest of the weights using the pre-trained weights
- re-train the entire neural network
Case 4: Large Data Set, Different Data
If the new data set is large and different from the original training data:
- remove the last fully connected layer and replace with a layer matching the number of classes in the new data set
- retrain the network from scratch with randomly initialized weights
- alternatively, you could just use the same strategy as the “large and similar” data case
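The four cases can be condensed into a small lookup table. The sketch below is just one way to summarize them; the names `STRATEGIES` and `pick_strategy` and the wording of each entry are made up for illustration.

```python
# Lookup table for the four cases: (dataset size, similarity to original data).
STRATEGIES = {
    ("small", "similar"): {
        "pretrained_layers_kept": "all but the final fully connected layer",
        "freeze_pretrained": True,
        "trained": "new fully connected layer only",
    },
    ("small", "different"): {
        "pretrained_layers_kept": "only the early layers",
        "freeze_pretrained": True,
        "trained": "new fully connected layer only",
    },
    ("large", "similar"): {
        "pretrained_layers_kept": "all but the final fully connected layer",
        "freeze_pretrained": False,
        "trained": "entire network, starting from pre-trained weights",
    },
    ("large", "different"): {
        "pretrained_layers_kept": "architecture only",
        "freeze_pretrained": False,
        "trained": "entire network, from random weights",
    },
}


def pick_strategy(size, similarity):
    """Return the entry for a ('small'|'large', 'similar'|'different') pair."""
    return STRATEGIES[(size, similarity)]
```

The pattern is easy to see in this form: a small data set means freezing the pre-trained weights, while a large one means training the whole network.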
3. CNN introduction
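The lesson first introduces ImageNet (a huge image dataset).
Keras Applications even lets you load well-known pre-trained image classification models directly, such as ResNet50, VGG16, InceptionV3…
AlexNet: an architecture similar to LeNet
VGG16/19 (the number is the layer count): built from stacks of 3×3 convolutions and pooling layers
VGG16 can be imported through Keras in two lines: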
from keras.applications.vgg16 import VGG16
model = VGG16(weights='imagenet', include_top=False)
GoogLeNet/Inception: introduces the concept of the inception module
GoogLeNet is notable for being fast, which makes it a candidate for use in self-driving cars
from keras.applications.inception_v3 import InceptionV3
model = InceptionV3(weights='imagenet', include_top=False)
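ResNet: achieves a lower error rate by going much deeper (152 layers)
Normally, though, adding more layers degrades a network (the degradation problem): as depth grows, accuracy saturates and then starts to drop.
So how does ResNet get around this? Answer: residual learning.
ResNet takes VGG19 as its reference, modifies it, and adds residual units through shortcut connections.
For details, see: ResNet intro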
from keras.applications.resnet50 import ResNet50
model = ResNet50(weights='imagenet', include_top=False)
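The residual learning idea mentioned above fits in a few lines: a residual unit outputs F(x) + x, where F is what the stacked layers learn and x travels along the shortcut. The toy sketch below uses a plain-Python stand-in for F (an element-wise scale followed by ReLU, not real convolutions; the function name and the numbers are hypothetical). It shows that when F contributes nothing the unit reduces to the identity, which is the intuition for why extra units need not make a deeper network worse.

```python
def residual_unit(x, weights):
    """Toy residual unit on a list of floats: output = F(x) + x."""
    # F(x): stand-in for the stacked layers (element-wise scale, then ReLU).
    fx = [max(0.0, w * xi) for w, xi in zip(weights, x)]
    # Shortcut connection: add the unchanged input back onto F(x).
    return [f + xi for f, xi in zip(fx, x)]


x = [1.0, -2.0, 3.0]
identity_out = residual_unit(x, [0.0, 0.0, 0.0])  # F(x) is all zeros
learned_out = residual_unit(x, [0.5, 0.5, 0.5])   # F(x) adds a correction
```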
4. Behavior Cloning project
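Finally, on to the project itself!
The goal this time is for the car to learn to drive on the road by itself without crossing the lane edges.
This mostly comes down to collecting a large amount of image data in which the car stays as close to the center as possible. Udacity's project guide covers all of this, but putting it into practice still took some time.
In this project, building the CNN was actually not the hard part; the tedious part was collecting the data myself and testing over and over in the simulator. During training and testing, the car often ran off the track at the turn around the 46-second mark of the video, so I recorded many extra images and steering-angle samples specifically for that spot, and eventually trained a model that completes the whole track.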
— — Additional Resources on Deep Learning — —
Behavioral Cloning
The below paper shows one of the techniques Waymo has researched using imitation learning (aka behavioral cloning) to drive a car.
ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst
Object Detection and Tracking
The below papers include various deep learning-based approaches to 2D and 3D object detection and tracking.
SSD: Single Shot MultiBox Detector
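1. SSD combines the regression idea from YOLO with the anchor mechanism from Faster R-CNN. It regresses on multi-scale region features at every position of the image, which keeps YOLO's speed while keeping window predictions about as accurate as Faster R-CNN's.
2. At its core, SSD applies convolution kernels to feature maps to predict the classes and coordinate offsets of a set of default bounding boxes. To improve detection accuracy, it makes predictions on feature maps of several different scales.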
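To make "coordinate offsets of default bounding boxes" concrete, here is a small sketch of how default box centers could be laid out on a feature map and how a predicted offset could be turned back into a box, following the center/size parameterization in the SSD paper. The function names and example values are made up for illustration, and the extra scaling ("variance") factors that real implementations typically apply are omitted.

```python
import math


def default_box_centers(grid_size):
    """Centers (cx, cy) of default boxes on a square feature map, in [0, 1]."""
    return [((j + 0.5) / grid_size, (i + 0.5) / grid_size)
            for i in range(grid_size) for j in range(grid_size)]


def decode_box(default_box, offsets):
    """Apply predicted offsets to a default box.

    default_box = (cx, cy, w, h); offsets = (t_cx, t_cy, t_w, t_h).
    """
    d_cx, d_cy, d_w, d_h = default_box
    t_cx, t_cy, t_w, t_h = offsets
    cx = d_cx + t_cx * d_w     # center shift is relative to the box width
    cy = d_cy + t_cy * d_h     # and to the box height
    w = d_w * math.exp(t_w)    # size offsets are predicted in log space
    h = d_h * math.exp(t_h)
    return (cx, cy, w, h)
```

A coarse grid such as `default_box_centers(2)` yields a few widely spaced boxes suited to large objects, while a finer grid yields many boxes for small ones, which is why predicting on several feature-map scales helps.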
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
Semantic Segmentation
The below paper concerns a technique called semantic segmentation, where each pixel of an image gets classified individually!
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation