ImageAI: image recognition in just a few lines of code, apparently…

Joe

Published in

Epopcon Data Science & Engineering

16 min read · Jul 24, 2018

ImageAI

OlafenwaMoses/ImageAI

ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer…

github.com

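The author of the repository above built this together with his brother; it is a kind of image recognition library(?). It is said to be built on TensorFlow, but you do not write any TensorFlow code when you use it…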

He did not design the image recognition networks himself; he bundled well-known networks into a library that is easy to use. Four heavyweight networks are available: SqueezeNet (an image recognition network from DeepScale, University of California, Berkeley, and Stanford University), ResNet50 by Microsoft, InceptionV3 by Google, and DenseNet121 by Facebook.

Hardly anyone could build a CNN from scratch that matches the performance of these networks, so rather than trying to write image recognition ourselves, let's just use this.

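With the ImageAI library you can download models of these four networks that were already trained on ImageNet (http://www.image-net.org/), a collection of some 14 million images, and use them as they are. You can also prepare your own training images and train one of the networks yourself to build a classification model for your own purpose.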

Installation

With Python 3.5 or later and TensorFlow 1.4 or later in place, install the following modules…

pip3 install numpy 
pip3 install scipy 
pip3 install opencv-python 
pip3 install pillow 
pip3 install matplotlib 
pip3 install h5py 
pip3 install keras

Then install ImageAI:

pip3 install https://github.com/OlafenwaMoses/ImageAI/releases/download/2.0.2/imageai-2.0.2-py3-none-any.whl
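
Before moving on, it can help to confirm that the packages above are actually importable. The following is a minimal sketch that uses only the standard library. The helper name missing_modules is made up for this illustration, and the list uses the usual import names of the packages (cv2 for opencv-python, PIL for pillow).

```python
import importlib.util
import sys

# Import names of the packages installed above
# (cv2 = opencv-python, PIL = pillow); the list itself is an assumption.
REQUIRED = ["numpy", "scipy", "cv2", "PIL", "matplotlib", "h5py",
            "keras", "tensorflow", "imageai"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if sys.version_info < (3, 5):
    print("Python 3.5 or later is required")
print("Missing modules:", missing_modules(REQUIRED) or "none")
```

If the last line lists anything, install that package with pip3 before running the examples.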

Image Prediction

Load a trained model and recognize an image.

from imageai.Prediction import ImagePrediction
import os

execution_path = os.getcwd()

# Use DenseNet with the weights pretrained on ImageNet
prediction = ImagePrediction()
prediction.setModelTypeAsDenseNet()
prediction.setModelPath(
    os.path.join(execution_path, "DenseNet-BC-121-32.h5"))
prediction.loadModel()

# Ask for the five most likely labels for n.jpg
predictions, probabilities = prediction.predictImage(
    os.path.join(execution_path, "n.jpg"), result_count=5)
for eachPrediction, eachProbability in zip(predictions, probabilities):
    # Commas instead of "+" so this also works if the probability is a float
    print(eachPrediction, ":", eachProbability)

Running the code above produces recognition results like these.

running_shoe : 97.33303785324097
clog : 1.956072635948658
Loafer : 0.23713677655905485
sock : 0.14817804330959916
sandal : 0.11422728421166539

The code is so simple that there is hardly anything to explain… The only parts worth a look are prediction.setModelTypeAsDenseNet(), which selects DenseNet, and the setModelPath() call, which points to "DenseNet-BC-121-32.h5", a DenseNet model file already trained on ImageNet images.
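
The probabilities in the output above are percentages, so the low-scoring labels can be dropped with a simple threshold. This is a small sketch, not part of ImageAI: the helper name confident_predictions and the 1.0 percent default are assumptions for illustration. float() is used so that it works whether the probabilities arrive as strings or as numbers.

```python
def confident_predictions(predictions, probabilities, min_percent=1.0):
    """Keep (label, percent) pairs whose probability is at least min_percent."""
    pairs = [(label, float(prob))
             for label, prob in zip(predictions, probabilities)]
    return [(label, prob) for label, prob in pairs if prob >= min_percent]


# Sample values copied from the output shown above.
labels = ["running_shoe", "clog", "Loafer", "sock", "sandal"]
percents = ["97.33303785324097", "1.956072635948658", "0.23713677655905485",
            "0.14817804330959916", "0.11422728421166539"]

for label, percent in confident_predictions(labels, percents):
    print("{} : {:.2f}%".format(label, percent))
```

With the sample values, only running_shoe and clog pass the 1 percent threshold.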

The trained models can be downloaded from the links below; change the setModelTypeAsXXXXXX() call to match the model you use.

SqueezeNet (Size = 4.82 mb, fastest prediction time and moderate accuracy)
ResNet50 by Microsoft Research (Size = 98 mb, fast prediction time and high accuracy)
InceptionV3 by Google Brain team (Size = 91.6 mb, slow prediction time and higher accuracy)
DenseNet121 by Facebook AI Research (Size = 31.6 mb, slower prediction time and highest accuracy)
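
Switching between the four models comes down to calling a different setter, which could be wrapped as sketched here. The helper configure_model and the lowercase keys are hypothetical. The setter names for ResNet and DenseNet appear in this post; the SqueezeNet and InceptionV3 names are assumed to follow the same pattern as in the ImageAI documentation.

```python
# Map a short model name to the setter method to call on the prediction object.
MODEL_SETTERS = {
    "squeezenet": "setModelTypeAsSqueezeNet",    # assumed name, same pattern
    "resnet": "setModelTypeAsResNet",
    "inceptionv3": "setModelTypeAsInceptionV3",  # assumed name, same pattern
    "densenet": "setModelTypeAsDenseNet",
}


def configure_model(prediction, model_name, model_path):
    """Select the network type and weight file on an ImagePrediction-like object."""
    setter_name = MODEL_SETTERS[model_name.lower()]  # KeyError for unknown names
    getattr(prediction, setter_name)()
    prediction.setModelPath(model_path)
    return prediction
```

For example, configure_model(ImagePrediction(), "densenet", "DenseNet-BC-121-32.h5") followed by loadModel() would match the prediction code earlier in this section.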

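Ordinary image recognition really is just that: download a model and type a few lines of code. It is almost too easy… x.x The ImageAI GitHub page has a few more guides, such as running predictions in multiple threads, so have a look there if you need them.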

Custom Training

How to prepare training images and labels and train a model of your own.

from imageai.Prediction.Custom import ModelTraining

# Folder that holds the train/ and test/ subfolders
DATASET_DIR = './fashion'

model_trainer = ModelTraining()
model_trainer.setModelTypeAsResNet()
model_trainer.setDataDirectory(DATASET_DIR)
model_trainer.trainModel(
    num_objects=3, num_experiments=100, enhance_data=True,
    batch_size=10, show_network_summary=True)

That is all there is to it! It says that we train with ResNet and that the training data lives in the ./fashion folder. There are only a handful of parameters: num_objects is the number of classes; num_experiments is the number of epochs; enhance_data, when True, reportedly enlarges the training set by slightly altering the given training images; batch_size is the batch size; and show_network_summary, when True, prints the CNN structure (set it to False if you do not need to see it).

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_2 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 112, 112, 64) 9472        input_2[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 112, 112, 64) 256         conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 112, 112, 64) 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
max_pooling2d_1 (MaxPooling2D)  (None, 55, 55, 64)   0           activation_1[0][0]               
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 55, 55, 64)   4160        max_pooling2d_1[0][0]            
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 55, 55, 64)   256         conv2d_3[0][0]...

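Layout of the training example folder

The training example folder (./fashion in the example above) is organized as described here. The json and models folders are created automatically, so you can ignore them; you only need to create the train and test folders. Under each of them, create one folder per class and use the label as the folder name, then put the image files in each class folder. That is it. The folders in this example represent three classes (men_bag, men_shoes, women_coat).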
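
The layout described above can be created with a few lines of standard-library Python. This is only a sketch: make_dataset_skeleton is a hypothetical helper, and it creates empty class folders that you still have to fill with images.

```python
import os


def make_dataset_skeleton(root, class_names):
    """Create root/train/<class> and root/test/<class>; return the created paths."""
    created = []
    for split in ("train", "test"):
        for name in class_names:
            path = os.path.join(root, split, name)
            os.makedirs(path, exist_ok=True)  # no error if it already exists
            created.append(path)
    return created
```

Calling make_dataset_skeleton("./fashion", ["men_bag", "men_shoes", "women_coat"]) would produce six folders: three class folders under ./fashion/train and three under ./fashion/test.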

Inside the json folder, model_class.json is generated automatically with the content below. The labels are taken from the folder names.

{
    "0" : "men_bag",
    "1" : "men_shoes",
    "2" : "women_coat"
}
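
If you want to use this mapping in your own code, it can be read with the standard json module. The sketch below parses the content shown above from a string; load_class_map is a hypothetical helper, and converting the keys to integers is a choice made here for convenience.

```python
import json


def load_class_map(json_text):
    """Parse model_class.json content into a dict keyed by integer class index."""
    raw = json.loads(json_text)
    return {int(index): label for index, label in raw.items()}


# Content copied from the model_class.json shown above.
sample = '{"0": "men_bag", "1": "men_shoes", "2": "women_coat"}'
class_map = load_class_map(sample)
print(class_map[1])
```

To read the real file, pass the text of json/model_class.json from the training folder to the same function.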

Inside the models folder, a trained model is saved at every epoch. One of these is loaded later for prediction. The number after acc in the file name is the accuracy.

Once the code and folders above are ready, run the code; training starts and a progress log like the one below appears. The val_acc: 0.2600 shown at the end of the first epoch is the accuracy. The model is trained on the images in the train folder and its accuracy is measured on the images in the test folder, and the model trained so far is saved in the models folder as 'model_ex-001_acc-0.260000.h5'. The code above asks for 100 epochs, so when training finishes there will be 100 model files; pick one with a good acc and use it for recognition later. Usually file number 100 will be the best, right?
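
Since the accuracy is part of each file name, picking a model can be automated. The sketch below assumes every file follows the pattern of the single example name above (model_ex-NNN_acc-X.XXXXXX.h5); best_model is a hypothetical helper, and the second and third sample names are made up from the val_acc values in the log.

```python
import re

# Pattern inferred from the one example file name in this post.
MODEL_NAME = re.compile(r"model_ex-(\d+)_acc-(\d+\.\d+)\.h5$")


def best_model(file_names):
    """Return the file name with the highest acc; ties go to the later epoch."""
    scored = []
    for name in file_names:
        match = MODEL_NAME.search(name)
        if match:
            epoch, acc = int(match.group(1)), float(match.group(2))
            scored.append((acc, epoch, name))
    return max(scored)[2] if scored else None


sample_names = [
    "model_ex-001_acc-0.260000.h5",
    "model_ex-002_acc-0.300000.h5",  # hypothetical name
    "model_ex-003_acc-0.460000.h5",  # hypothetical name
]
print(best_model(sample_names))
```

In practice the list would come from os.listdir() on the models folder.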

12/13 [==========================>...] - ETA: 1s - loss: 3.1956 - acc: 0.5083  
13/13 [==============================] - 22s 2s/step - loss: 3.0100 - acc: 0.5000 - val_loss: 3.2694 - val_acc: 0.2600
Epoch 2/100
12/13 [==========================>...] - ETA: 0s - loss: 1.8484 - acc: 0.6889
13/13 [==============================] - 6s 483ms/step - loss: 1.8499 - acc: 0.6898 - val_loss: 11.2827 - val_acc: 0.3000
Epoch 3/100
12/13 [==========================>...] - ETA: 0s - loss: 1.1971 - acc: 0.6972
13/13 [==============================] - 6s 474ms/step - loss: 1.2458 - acc: 0.6898 - val_loss: 5.1547 - val_acc: 0.4600
Epoch 4/100
12/13 [==========================>...] - ETA: 0s - loss: 0.8438 - acc: 0.7833
13/13 [==============================] - 6s 474ms/step - loss: 0.8360 - acc: 0.7746 - val_loss: 8.1957 - val_acc: 0.3600
Epoch 5/100
12/13 [==========================>...] - ETA: 0s - loss: 0.7698 - acc: 0.7639
13/13 [==============================] - 6s 473ms/step - loss: 0.7395 - acc: 0.7669 - val_loss: 5.3546 - val_acc: 0.3600
...

Custom Prediction

How to load the model you trained and recognize an image.

from imageai.Prediction.Custom import CustomImagePrediction
import os
# Jupyter/IPython magic; remove this line when running as a plain script
%pylab inline
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

execution_path = os.getcwd() + '/custom_prediction/'

# Load the custom-trained ResNet model together with its label map
prediction = CustomImagePrediction()
prediction.setModelTypeAsResNet()
prediction.setModelPath(
    os.path.join(execution_path, "model_ex-100_acc-1.000000.h5"))
prediction.setJsonPath(
    os.path.join(execution_path, "model_class.json"))
prediction.loadModel(num_objects=3)

file = '490577787_1_500.jpg'
predictions, probabilities = prediction.predictImage(
    os.path.join(execution_path, file), result_count=3)

# Show the image, then print each label with its probability
img = mpimg.imread(os.path.join(execution_path, file))
imgplot = plt.imshow(img)
plt.show()
for eachPrediction, eachProbability in zip(predictions, probabilities):
    # Commas instead of "+" so this also works if the probability is a float
    print(eachPrediction, ":", eachProbability)
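
To check a trained model on more than one image, the same call can be looped over a folder laid out like the test folder (one subfolder per label). The following is a sketch under that assumption. folder_accuracy is a hypothetical helper that takes any function mapping an image path to a label, so it does not depend on ImageAI itself.

```python
import os


def folder_accuracy(test_dir, predict_label):
    """Fraction of files under test_dir/<label>/ for which predict_label(path) == label."""
    total = 0
    correct = 0
    for label in sorted(os.listdir(test_dir)):
        class_dir = os.path.join(test_dir, label)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        for file_name in sorted(os.listdir(class_dir)):
            total += 1
            if predict_label(os.path.join(class_dir, file_name)) == label:
                correct += 1
    return correct / total if total else 0.0
```

With the prediction object from the code above, predict_label could be something like lambda path: prediction.predictImage(path, result_count=1)[0][0], which takes the top label from the returned list of predictions.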