Real-time object detection & deployment using Tensorflow, Keras and AWS EC2 instance

Shirish Gupta · Published in The Startup · 6 min read · Mar 22, 2020

This story is a joint effort of Abhijeet Biswas and Shirish Gupta

Living in a metropolis, we often find searching for a parking slot a real nightmare. More often than not we have no clue whether a slot will be available in a busy mall, office building or stadium. And even if one is available, it takes ages to locate it while spiralling down multiple levels. Sometimes the problem is reduced by employing an enormous pool of security guards who help you navigate the complex labyrinth, but that is not only expensive but also extremely inefficient in the long run. So we decided to build our own car counter using basic principles of computer vision, deploy it on an EC2 instance and test it via live streaming from a mobile phone.

The Beginning

The first step was to collect the relevant data: videos from an actual parking lot. We wanted to keep it as real as possible, and therefore didn't scour the internet for dummy data from Kaggle or other data repositories but instead created our own. We recorded a bunch of sample videos from our apartment's parking lot and used them for our models.

Video of car parking as seen from balcony
My Apartment’s Car Parking

Creating the model was perhaps the easiest part of the whole process. There are so many open-source resources available online that you don't need to reinvent the wheel. For the purpose of this article we will skip the theory, but we promise to publish another article covering it.

Model Architecture

There are multiple object detection models available online, such as R-CNN, Fast R-CNN, Mask R-CNN, YOLO, etc. But after going back and forth between them, we decided to use YOLO-v3-tiny. It's a version of YOLO (You Only Look Once) that is light and fast. Unlike its big brother, it has been designed to work on processors lacking a GPU. It has been trained on the PASCAL VOC and COCO datasets and can successfully classify 29 different objects.

YOLO-v3-tiny works extremely well at detecting cars from all camera angles

Let’s start coding

To alleviate our long coding woes, there is a Python library called "cvlib" which drastically reduces the number of lines of code we have to write. Also install "FastAPI" for deployment purposes.

Install the following libraries if not installed —

pip install opencv-python tensorflow
pip install cvlib
pip install FastAPI
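To get a feel for the cvlib API before building the app: `detect_common_objects` (which we use below) returns three parallel lists, namely bounding boxes, labels and confidences, so counting cars reduces to filtering the labels. A minimal sketch; `count_cars` and the 0.5 threshold are our own illustration, not part of cvlib:

```python
def count_cars(labels, confidences, threshold=0.5):
    """Count detections labelled 'car' whose confidence clears the threshold."""
    return sum(
        1 for label, conf in zip(labels, confidences)
        if label == 'car' and conf >= threshold
    )

# Hypothetical lists in the format cv.detect_common_objects returns
labels = ['car', 'person', 'car', 'car']
confidences = [0.91, 0.88, 0.42, 0.77]
print(count_cars(labels, confidences))  # 2 (the 0.42 car is filtered out)
```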

All About the FastAPI app

We plan to deploy it on an HTTP web server, so we create a FastAPI app which does the following:

  1. Receives a photo and converts it to an np.array
  2. Performs a couple of preprocessing steps on the photo
  3. Passes it through the cvlib function to detect the number of cars in the image
  4. Returns the number of cars detected
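Step 1 above, turning the uploaded bytes into a flat array that cv2.imdecode can consume, can be sketched in isolation (the four-byte JPEG header below is illustrative only, not a real image):

```python
import io

import numpy as np

# Raw upload bytes, as FastAPI's File(...) would hand them to us
raw = b'\xff\xd8\xff\xe0'  # first four bytes of a JPEG, for illustration only

stream = io.BytesIO(raw)
stream.seek(0)

# Flat uint8 array: the buffer format cv2.imdecode expects
arr = np.asarray(bytearray(stream.read()), dtype=np.uint8)
print(arr.shape, arr.dtype)  # (4,) uint8
```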
import io

import cv2
import cvlib as cv
import numpy as np
from cvlib.object_detection import draw_bbox
from fastapi import FastAPI, File
from pydantic import BaseModel
from starlette.requests import Request

app = FastAPI()


class ImageType(BaseModel):
    url: str


@app.get("/")
def home():
    return "Home"


@app.post("/predict/")
def prediction(request: Request, file: bytes = File(...)):
    if request.method == "POST":
        # Decode the uploaded bytes into an OpenCV image
        image_stream = io.BytesIO(file)
        image_stream.seek(0)
        file_bytes = np.asarray(bytearray(image_stream.read()), dtype=np.uint8)
        frame = cv2.imdecode(file_bytes, cv2.IMREAD_COLOR)
        # Detect objects and count the 'car' labels
        bbox, label, conf = cv.detect_common_objects(frame)
        output_image = draw_bbox(frame, bbox, label, conf)
        num_cars = label.count("car")
        print("Number of cars in the image is " + str(num_cars))
        return {"num_cars": num_cars}
    return "No post request found"

We can run this app on our local machines, but what's the fun in that? To see the real potential of the algorithm, we need to make it accessible to everyone and scalable. So we decided to do two things:

  1. Create a connection with a mobile phone and do a live test using its camera.
  2. Deploy the app on an AWS EC2 instance.

All about IPWebcam

Within 10 minutes of research we found an app called "IPWebcam" which lets you create a live connection and stream your mobile's video to your laptop screen. Just download it from the Google Play Store and grant all the necessary permissions. Scroll down to "Start server" and copy the relevant IPv4 address into your code. Make sure that both your laptop and mobile are connected to the same Wi-Fi network.
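IPWebcam serves a single JPEG frame at the /shot.jpg path (the same endpoint our capture loop below polls), so the URL to fetch is easy to assemble from the IPv4 address shown in the app. The `shot_url` helper is our own convenience; 8080 is the port the app uses by default:

```python
def shot_url(ipv4, port=8080):
    """Build the single-frame (shot.jpg) URL that IPWebcam serves."""
    return f'http://{ipv4}:{port}/shot.jpg'

print(shot_url('192.168.0.104'))  # http://192.168.0.104:8080/shot.jpg
```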

Make sure you copy the correct IPv4

All about AWS EC2 Instance

We have used Amazon Elastic Compute Cloud (Amazon EC2) to deploy our app (more on that to follow). EC2 is a web service that provides secure, resizable compute capacity in the cloud. With minimal effort you can configure and spin up your computing resource (aka an instance). There are several tutorials and online videos on how to create an EC2 instance, so we will skip that part.

We have used the 't2.micro' instance type, which is available in the free tier. After creating the instance, we also added port 80 of type HTTP to the inbound rules of its security group. This step is crucial, otherwise the app won't be accessible. Once you can SSH into your EC2 instance, make sure that you are using Python 3.x by running

python3 --version

Then run the following commands to install the required packages

sudo apt-get update
sudo apt install python3-pip
sudo apt-get install libsm6 libxrender1 libfontconfig1 libice6 nginx gunicorn
pip3 install uvicorn==0.11.1 cvlib==0.2.3 starlette==0.12.9 opencv_python==4.1.2.30 pydantic==1.3 Pillow==7.0.0 fastapi==0.45.0 numpy==1.18.1 tensorflow gevent

We are using gunicorn and nginx as our servers here. gunicorn is a Python Web Server Gateway Interface (WSGI) HTTP server for Unix which works with many different web frameworks. It will create a socket which serves responses to nginx's requests. We will use nginx as a proxy server, which faces the web and passes requests on to our application server (gunicorn).

Using a proxy server makes the application run faster, reduces downtime, consumes fewer server resources, and improves security. Don't get bogged down by the jargon; just read up on web servers and you should be fine.

Now we need to copy the code onto our instance. We can use 'scp', or 'git clone' if we have pushed it to a git repository, and save it in a folder called "myApp". We then run the following commands to configure nginx:

cd /etc/nginx/sites-enabled/
sudo vim myApp

This will open a text editor where we need to enter the following code and save it as “myApp”

server {
    listen 80;
    server_name YOUR_PUBLIC_IP;

    location / {
        proxy_pass http://127.0.0.1:8000;
    }
}

Next, we need to restart the nginx server from the terminal.

sudo service nginx restart

Go to the app directory and enter the following to start the application server.

cd myApp
gunicorn -w 4 -k uvicorn.workers.UvicornWorker myApp:app

VOILA — the server has started and we can test the app.

All about final implementation

Finally, run the following code. It reads frames from the live video and sends each one to the FastAPI app that we created earlier, which performs a couple of image-processing steps and passes the frame through the prediction function.

import urllib.request

import numpy as np
import requests

ipwebcam_url = 'http://192.168.0.104:8080/shot.jpg'
fastapi_post_url = 'http://54.147.194.67/predict/'

while True:
    # Grab a single frame from the IPWebcam server
    imgResp = urllib.request.urlopen(ipwebcam_url)
    imgNp = np.array(bytearray(imgResp.read()), dtype=np.uint8)

    try:
        # Send the raw frame bytes to the FastAPI endpoint
        r = requests.post(fastapi_post_url, files={'file': imgNp.tobytes()})
        num_cars = r.json()['num_cars']
    except Exception as e:
        print('Error: ', e)
        num_cars = -999
    print(f'Number of cars in the image is {num_cars}')

And there we have the number of predicted cars in the frame. If you are experiencing low Wi-Fi speeds, you may want to reduce the fps (frames per second).
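One simple way to cap the frame rate in the loop above is to sleep between requests. A minimal sketch; the `throttle` helper and the target of 2 fps are our own illustration, not part of any library:

```python
import time

def throttle(last_time, target_fps=2):
    """Sleep just long enough so loop iterations run at or below target_fps."""
    min_interval = 1.0 / target_fps
    elapsed = time.monotonic() - last_time
    if elapsed < min_interval:
        time.sleep(min_interval - elapsed)
    return time.monotonic()

# In the capture loop, call last = throttle(last) before each request
last = time.monotonic()
last = throttle(last, target_fps=5)  # waits ~0.2 s on an otherwise idle loop
```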

Video of car parking as seen from balcony

Viewers — Have any ideas to improve this or want us to try any new ideas? Please give your suggestions in the comments.


I am an Economist by academia, Data Scientist by profession and a Traveler by heart. Here to write simple, interesting blogs that everyone can read!