YOLACT with Jupyter Notebook

A Real-Time Object Segmentation Algorithm to Detect Multiple Objects

Romesh Perera
Mediio
3 min read · Oct 6, 2021


Instance segmentation

YOLACT stands for “You Only Look At CoefficienTs”. It is a real-time instance segmentation algorithm that produces both bounding boxes and pixel masks, running at around 30 FPS with competitive accuracy. Like most object detectors, it must suppress duplicate detections of the same object; instead of standard non-maximum suppression (NMS), YOLACT uses a faster vectorized variant the authors call Fast NMS, which contributes to its real-time speed.
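YOLACT gets part of its speed from Fast NMS, which suppresses overlapping duplicate detections in a single matrix operation rather than a sequential loop. Here is a minimal NumPy sketch of the idea (my own illustration, not the authors' implementation): sort boxes by score, compute the pairwise IoU matrix, keep only the upper triangle so each box is compared against higher-scoring boxes, and drop any box whose maximum such IoU exceeds the threshold.

```python
import numpy as np

def iou_matrix(boxes):
    # boxes: (N, 4) as [x1, y1, x2, y2]; returns the pairwise IoU matrix (N, N)
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area[:, None] + area[None, :] - inter)

def fast_nms(boxes, scores, iou_thresh=0.5):
    # Sort by score, then suppress every box that overlaps a
    # higher-scoring box too much -- all in one matrix operation.
    order = np.argsort(-scores)
    boxes, scores = boxes[order], scores[order]
    iou = np.triu(iou_matrix(boxes), k=1)  # compare only against higher-scored boxes
    keep = iou.max(axis=0) <= iou_thresh
    return boxes[keep], scores[keep]

# Two heavily overlapping boxes plus one separate box:
boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept_boxes, kept_scores = fast_nms(boxes, scores)
# The 0.8 box is suppressed because it overlaps the 0.9 box.
```

Unlike classic sequential NMS, a box suppressed here can still suppress others, which slightly changes the results; the YOLACT paper reports the accuracy cost is negligible compared to the speedup.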

Note: I'm using Ubuntu 16.04 and a Tesla T4 GPU with Anaconda3. For the installations, I recommend using pip commands. You can also use conda commands, but there can be version conflicts.

For this article, I'm using a conda virtual environment with pip commands. To create the virtual environment, follow these steps.

  • Create a new virtual environment with Python 3.6
conda create -n myenv python=3.6
  • Activate the created virtual environment
source activate myenv

After activating the virtual environment, the next step is to install the following packages using the pip package installer.

1. Install Cython and pycocotools

!pip install cython
!pip install opencv-python pillow pycocotools matplotlib

2. Install PyTorch 1.0.1 or a higher version, along with torchvision

!pip install torchvision==0.5.0

(torchvision 0.5.0 pulls in the matching PyTorch 1.4.0 as a dependency, so a separate torch install isn't needed.)
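YOLACT needs PyTorch 1.0.1 or newer, and torchvision 0.5.0 is built against PyTorch 1.4.0, so this pairing satisfies the requirement. A small sketch for checking a version string against that minimum at runtime (`version_tuple` is my own helper, not part of YOLACT or PyTorch; in a notebook you would pass it `torch.__version__`):

```python
def version_tuple(v):
    # "1.4.0+cu100" -> (1, 4, 0): strip any local build suffix, split on dots
    return tuple(int(p) for p in v.split("+")[0].split("."))

MIN_TORCH = (1, 0, 1)

# "1.4.0" is the PyTorch release that torchvision 0.5.0 installs.
print(version_tuple("1.4.0") >= MIN_TORCH)  # -> True
```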

Note: I created a separate folder named YoloAct to clone the repository into. To create the folder, you can use this command.

mkdir YoloAct

The next step is to clone the repository into the YoloAct folder. Special gratitude to the authors: Daniel Bolya, Chong Zhou, Fanyi Xiao, and Yong Jae Lee. To clone the repository, use the following command.

!git clone https://github.com/dbolya/yolact.git

Once you have cloned the repository, you will find a folder named yolact. Using the ‘cd’ command, change into it; the DCNv2 module sits inside at external/DCNv2. Next, change into that DCNv2 directory and build the deformable-convolution extension by running its setup.py file, as follows.

!python setup.py build develop

Once the build is finished, we need to download the YOLACT pre-trained models, which come in several variants with different FPS rates and mAPs. You can find them listed in the GitHub repository.

Note: These weights are pre-trained models; if you wish to do custom training on your own dataset, you can find the steps in the GitHub repository. I hope to write a separate article about this in the future.

I downloaded the “yolact_base_54_800000.pth” weights file. Once everything is completed, we can move on to testing. For testing, we run the “eval.py” script, which supports images, videos, and a real-time webcam feed.

  • To process a whole folder of images, run
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --images=path/to/input/folder:path/to/output/folder
  • To process a video
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=input_video.mp4:output_video.mp4
  • To display a webcam feed in real time. If you have several webcams, you can select one by changing the index (e.g. --video=1).
python eval.py --trained_model=weights/yolact_base_54_800000.pth --score_threshold=0.15 --top_k=15 --video_multiframe=4 --video=0
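Two flags repeated in every command above are worth picturing: --score_threshold drops low-confidence detections, and --top_k caps how many detections survive per image. A small sketch with made-up (score, label) pairs standing in for eval.py's output (the `filter_detections` helper is illustrative, not YOLACT's actual code):

```python
# Hypothetical detections for one image: (confidence, label) pairs.
detections = [(0.92, "person"), (0.40, "dog"), (0.10, "chair"),
              (0.75, "car"), (0.05, "kite")]

def filter_detections(dets, score_threshold=0.15, top_k=15):
    # 1) Drop everything below the confidence threshold.
    kept = [d for d in dets if d[0] >= score_threshold]
    # 2) Keep only the top_k highest-scoring survivors.
    kept.sort(key=lambda d: -d[0])
    return kept[:top_k]

print(filter_detections(detections, score_threshold=0.15, top_k=3))
# -> [(0.92, 'person'), (0.75, 'car'), (0.40, 'dog')]
```

Raising --score_threshold cleans up spurious boxes at the cost of missing faint objects; lowering --top_k speeds up rendering on crowded scenes.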

Unfortunately, I couldn't do custom training with my dataset :(

I will do it later…

Have a nice day!! :)
