MobilenetSSD: A Machine Learning Model for Fast Object Detection

Published in

axinc

10 min read · Sep 23, 2020

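This article introduces MobilenetSSD, a machine learning model that can be used with the ailia SDK. By combining the ailia SDK, an inference framework for edge devices, with the machine learning models published in ailia MODELS, you can easily add AI features to your applications.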

Overview of MobilenetSSD

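MobilenetSSD is an end-to-end object detection model that computes the bounding boxes and categories of objects in an input image. It uses Mobilenet as the backbone of the Single Shot Detector (SSD), which yields fast object detection optimized for mobile devices.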

SSD: Single Shot MultiBox Detector

We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD…

arxiv.org

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

We present a class of efficient models called MobileNets for mobile and embedded vision applications. MobileNets are…

arxiv.org

Architecture of MobilenetSSD

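MobilenetSSD takes a (3,300,300) image as input and outputs boxes of shape (1,3000,4) and scores of shape (1,3000,21). boxes holds (cx,cy,w,h) as offsets from the default boxes. scores holds the scores for the 20 VOC categories plus cat=0, which is reserved for BACKGROUND, for 21 entries per box.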
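To make the "offsets from the default boxes" idea concrete, here is a minimal NumPy sketch of the decoding step. It assumes the commonly used SSD decoding formula together with the center_variance and size_variance values from the config in this article; the function names are hypothetical and this is not necessarily how the ailia sample handles the output.

```python
import numpy as np

def decode_boxes(locations, priors, center_variance=0.1, size_variance=0.2):
    """Convert SSD regression offsets into (cx, cy, w, h) boxes.

    locations: (..., N, 4) offsets predicted by the network.
    priors:    (N, 4) default boxes as (cx, cy, w, h), normalized to 0-1.
    """
    # Centers are shifted in units of the default box size.
    centers = locations[..., :2] * center_variance * priors[..., 2:] + priors[..., :2]
    # Sizes are scaled exponentially relative to the default box size.
    sizes = np.exp(locations[..., 2:] * size_variance) * priors[..., 2:]
    return np.concatenate([centers, sizes], axis=-1)

def center_to_corner(boxes):
    """Convert (cx, cy, w, h) into (xmin, ymin, xmax, ymax)."""
    half = boxes[..., 2:] / 2
    return np.concatenate([boxes[..., :2] - half, boxes[..., :2] + half], axis=-1)

# Illustrative values only: one default box and an all-zero offset.
example_priors = np.array([[0.5, 0.5, 0.2, 0.4]])
example_locations = np.zeros((1, 1, 4))
print(center_to_corner(decode_boxes(example_locations, example_priors)))
```

With all-zero offsets the decoded box is the default box itself, which is a quick sanity check when wiring this up.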

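In SSD, features are first extracted with an arbitrary backbone. The Extra Feature Layers then reduce the resolution step by step, and bounding boxes are computed at each resolution. MobilenetSSD concatenates the outputs of six resolution levels, computing 3000 bounding boxes in total. Finally, NMS (non-maximum suppression) removes duplicate detections.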
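The duplicate removal step can be sketched as a plain hard NMS in NumPy. This is an illustrative sketch, not the exact routine used by pytorch-ssd or the ailia sample; it assumes boxes in corner form (xmin, ymin, xmax, ymax) and uses the iou_threshold of 0.45 from the config in this article. The function names are hypothetical.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one corner-form box and an array of corner-form boxes."""
    left_top = np.maximum(box[:2], boxes[:, :2])
    right_bottom = np.minimum(box[2:], boxes[:, 2:])
    inter_wh = np.clip(right_bottom - left_top, 0.0, None)
    inter = inter_wh[:, 0] * inter_wh[:, 1]
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def hard_nms(boxes, scores, iou_threshold=0.45):
    """Return the indices of the boxes that are kept, highest score first."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the best box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Illustrative values only: the second box almost coincides with the first.
example_boxes = np.array([[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 0.9], [2.0, 2.0, 3.0, 3.0]])
example_scores = np.array([0.9, 0.8, 0.7])
print(hard_nms(example_boxes, example_scores))
```

In practice NMS is run per category, on the boxes whose score for that category exceeds a confidence threshold.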

The MobilenetSSD config is shown below. SSDSpec defines the default boxes for each resolution.

image_size = 300
image_mean = np.array([127, 127, 127]) # RGB layout
image_std = 128.0
iou_threshold = 0.45
center_variance = 0.1
size_variance = 0.2
specs = [
    SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]),
    SSDSpec(10, 32, SSDBoxSizes(105, 150), [2, 3]),
    SSDSpec(5, 64, SSDBoxSizes(150, 195), [2, 3]),
    SSDSpec(3, 100, SSDBoxSizes(195, 240), [2, 3]),
    SSDSpec(2, 150, SSDBoxSizes(240, 285), [2, 3]),
    SSDSpec(1, 300, SSDBoxSizes(285, 330), [2, 3])
]

qfgaohao/pytorch-ssd

MobileNetV1, MobileNetV2, VGG based SSD/SSD-lite implementation in Pytorch 1.0 / Pytorch 0.4. Out-of-box support for…

github.com
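As a rough illustration of how image_size, image_mean, and image_std from the config above are typically applied, the sketch below normalizes an already resized RGB image and reorders it into the (1, 3, 300, 300) layout. The preprocess function is hypothetical, and the ailia sample may apply its own preprocessing, so treat this as an assumption about the pytorch-ssd side only.

```python
import numpy as np

IMAGE_SIZE = 300
IMAGE_MEAN = np.array([127, 127, 127], dtype=np.float32)  # RGB layout
IMAGE_STD = 128.0

def preprocess(rgb_image):
    """Normalize a (300, 300, 3) uint8 RGB image into a (1, 3, 300, 300) float32 array."""
    if rgb_image.shape != (IMAGE_SIZE, IMAGE_SIZE, 3):
        raise ValueError("expected an RGB image already resized to 300x300")
    # Subtract the mean and divide by the std, giving values of roughly -1 to 1.
    x = (rgb_image.astype(np.float32) - IMAGE_MEAN) / IMAGE_STD
    # Reorder HWC to CHW and add the batch dimension.
    return x.transpose(2, 0, 1)[np.newaxis]

# Illustrative input: a uniform gray image.
gray = np.full((IMAGE_SIZE, IMAGE_SIZE, 3), 127, dtype=np.uint8)
print(preprocess(gray).shape)
```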

SSDSpec is defined as follows.

SSDSpec = collections.namedtuple('SSDSpec', ['feature_map_size', 'shrinkage', 'box_sizes', 'aspect_ratios'])

For SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]), six boxes are defined: square boxes of 60x60 and 105x105, plus aspect-ratio-2 boxes of 120x60, 60x120, 210x105, and 105x210.

qfgaohao/pytorch-ssd

github.com

The detection results from the six levels are concatenated, producing 3000 bounding boxes in total.
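The total of 3000 can be checked with a short calculation. The sketch below assumes that every feature map cell gets two square boxes plus a wide and a tall box per aspect ratio, which matches the six boxes per cell described above. The SSDBoxSizes field names (min, max) and the helper functions are assumptions made for illustration.

```python
from collections import namedtuple

# Field names for SSDBoxSizes are assumed here for illustration.
SSDBoxSizes = namedtuple("SSDBoxSizes", ["min", "max"])
SSDSpec = namedtuple("SSDSpec", ["feature_map_size", "shrinkage", "box_sizes", "aspect_ratios"])

specs = [
    SSDSpec(19, 16, SSDBoxSizes(60, 105), [2, 3]),
    SSDSpec(10, 32, SSDBoxSizes(105, 150), [2, 3]),
    SSDSpec(5, 64, SSDBoxSizes(150, 195), [2, 3]),
    SSDSpec(3, 100, SSDBoxSizes(195, 240), [2, 3]),
    SSDSpec(2, 150, SSDBoxSizes(240, 285), [2, 3]),
    SSDSpec(1, 300, SSDBoxSizes(285, 330), [2, 3]),
]

def boxes_per_cell(spec):
    # Two square boxes, plus one wide and one tall box per aspect ratio.
    return 2 + 2 * len(spec.aspect_ratios)

def count_default_boxes(specs):
    # Each level contributes feature_map_size x feature_map_size cells.
    return sum(spec.feature_map_size ** 2 * boxes_per_cell(spec) for spec in specs)

print(count_default_boxes(specs))  # 3000
```

The six feature maps have 361 + 100 + 25 + 9 + 4 + 1 = 500 cells, and 500 cells times 6 boxes gives the 3000 rows seen in the boxes and scores outputs.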

Using MobilenetSSD from the ailia SDK

A sample that runs MobilenetSSD with the ailia SDK is available below.

axinc-ai/ailia-models

Ailia input shape(1, 3, 300, 300) Range:[0, 1] Automatically downloads the onnx and prototxt files on the first run. It…

github.com

The following command runs MobilenetSSD on a webcam.

python3 mobilenet_ssd.py -v 0

Training MobilenetSSD on your own dataset

To train MobilenetSSD, use the pytorch-ssd repository below.

qfgaohao/pytorch-ssd

This repo implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects…

github.com

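pytorch-ssd uses a Lambda in its DataLoader, and a lambda cannot be pickled when Windows starts the DataLoader worker processes, so training does not work on Windows. Use Linux or macOS instead.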

Can't pickle local object 'DataLoader.__init__.<locals>.<lambda>'

Hi all, I hope everybody reading this is having a great day. So I have a problem with torchvision.transforms.Lambda()…

discuss.pytorch.org
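The underlying limitation can be reproduced without PyTorch. The sketch below only shows that Python's pickle module cannot serialize a lambda, which is what a process started with the spawn method (the default on Windows) needs to do with the objects it receives. The helper is_picklable is a hypothetical name used for illustration.

```python
import pickle

def is_picklable(obj):
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A builtin function is pickled by reference to its importable name.
print(is_picklable(len))
# A lambda has no importable name, so pickling it fails.
print(is_picklable(lambda x: x))
```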

The training data uses the Open Images Dataset format. Training requires the following four kinds of files.

/dataset/open_images_mixed/sub-test-annotations-bbox.csv
/dataset/open_images_mixed/sub-train-annotations-bbox.csv
/dataset/open_images_mixed/train/images.jpg
/dataset/open_images_mixed/test/images.jpg

The CSV format is as follows.

ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,id,ClassName

ImageID holds the image file name without its extension, and XMin through YMax hold the bounding box coordinates normalized to the 0-1 range. ClassName holds the category. For example:

img_591,xclick,/m/0gxl3,1,0.40920866666666667,0.08862621809744783,0.7894286666666666,0.6620986078886312,0,0,0,0,0,/m/0gxl3,Handgun
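If the annotations are generated by a script, rows in this format can be written with Python's csv module. The sketch below is one possible way to do it; the helper names, the image ID img_001, and the coordinate values are hypothetical, and the fixed values (xclick, Confidence 1, zeros for the Is* flags) simply mirror the example row above.

```python
import csv
import io

HEADER = ["ImageID", "Source", "LabelName", "Confidence", "XMin", "XMax", "YMin", "YMax",
          "IsOccluded", "IsTruncated", "IsGroupOf", "IsDepiction", "IsInside", "id", "ClassName"]

def make_row(image_id, label_id, class_name, xmin, xmax, ymin, ymax):
    """Build one annotation row; coordinates must be normalized to 0-1."""
    for value in (xmin, xmax, ymin, ymax):
        if not 0.0 <= value <= 1.0:
            raise ValueError("coordinates must be normalized to the 0-1 range")
    return [image_id, "xclick", label_id, 1, xmin, xmax, ymin, ymax,
            0, 0, 0, 0, 0, label_id, class_name]

def write_annotations(rows, stream):
    """Write the header followed by the annotation rows."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)

# Illustrative values only.
buffer = io.StringIO()
write_annotations([make_row("img_001", "/m/0gxl3", "Handgun", 0.1, 0.5, 0.2, 0.8)], buffer)
print(buffer.getvalue())
```

To produce the real files, pass an open file object for sub-train-annotations-bbox.csv or sub-test-annotations-bbox.csv instead of the in-memory buffer.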

Training images are looked up as ImageID.jpg in the train folder, so place them there.
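Before starting a long training run, it can help to confirm that every ImageID in the CSV has a matching .jpg file. The following is a small sketch of such a check; find_missing_images is a hypothetical helper and not part of pytorch-ssd.

```python
import csv
from pathlib import Path

def find_missing_images(csv_path, image_dir):
    """Return the ImageID values in the CSV that have no matching .jpg in image_dir."""
    image_dir = Path(image_dir)
    missing = []
    checked = set()
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            image_id = row["ImageID"]
            if image_id in checked:
                continue  # an image can appear in several annotation rows
            checked.add(image_id)
            if not (image_dir / f"{image_id}.jpg").is_file():
                missing.append(image_id)
    return missing
```

For the layout described above, this would be called with the path to sub-train-annotations-bbox.csv and the train folder, and an empty list means every annotated image was found.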

Training uses transfer learning, so download the pretrained model first.

wget -P models https://storage.googleapis.com/models-hao/mb2-ssd-lite-mp-0_686.pth

Run the training.

python3 train_ssd.py --dataset_type open_images --datasets ./dataset --net mb2-ssd-lite --pretrained_ssd models/mb2-ssd-lite-mp-0_686.pth --scheduler cosine --lr 0.001 --t_max 100 --validation_epochs 5 --num_epochs 100 --base_net_lr 0.001 --batch_size 5

The training results and open-images-model-labels.txt are written to the models folder. On the CPU of a 13-inch MacBook Pro, training takes roughly 38 hours.

Check the training result.

python3 run_ssd_example.py mb2-ssd-lite models/mb2-ssd-lite-Epoch-80-Loss-2.4882763324521524.pth models/open-images-model-labels.txt input.jpg

The ailia SDK requires the model to be exported with opset=10, so add opset_version=10 to the torch.onnx.export call in convert_to_caffe2_models.py.

torch.onnx.export(net, dummy_input, model_path, verbose=False, output_names=['scores', 'boxes'], opset_version=10)

Export the model to ONNX so that it can be used with the ailia SDK.

python3 convert_to_caffe2_models.py mb2-ssd-lite models/mb2-ssd-lite-Epoch-80-Loss-2.4882763324521524.pth models/open-images-model-labels.txt

For a sample that covers everything from training to ONNX conversion, see the repository below.

axinc-ai/mobilenetssd-face

Pytorch 1.0 Windows is not working…

github.com

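ax Inc. is a company focused on putting AI into practical use, and develops the ailia SDK, which enables fast, cross-platform inference on GPUs. ax Inc. provides total AI solutions, from consulting and model creation to SDK provision, development of AI-based applications and systems, and support, so please feel free to contact us.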

Related articles

YOLO v3: A machine learning model that detects the position and type of objects

YOLO v4: A machine learning model for object detection

M2Det: A high-accuracy object detection model

Written by Kazuki Kyakuno
