Computer vision has made tremendous strides in recent years, thanks to deep learning models like YOLO (You Only Look Once). YOLO is known for its exceptional real-time object detection capabilities and has become a popular choice for a wide range of applications. In this article, we will walk through optimizing the YOLOv7 model with Intel® Extension for PyTorch*, then training a YOLOv7 model on a custom dataset, empowering you to create your own powerful object detection system.
What is YOLOv7?
YOLOv7 is an extension of the YOLO series of object detection models. It combines the best features of previous versions while introducing improvements in both accuracy and speed. The official YOLOv7 implementation is built on PyTorch and uses a single neural network to simultaneously predict object bounding boxes and their associated class probabilities.
Cloning the YOLOv7 Repo and Installing the Required Dependencies/Packages
git clone https://github.com/WongKinYiu/yolov7
cd yolov7
pip install virtualenv
virtualenv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # Linux/macOS
pip install -r requirements.txt
How to fine-tune the existing YOLOv7 code to add Intel® Extension for PyTorch*?
Intel® Extension for PyTorch* extends PyTorch* with the latest feature optimizations for an extra performance boost on Intel hardware. Optimizations take advantage of AVX-512 Vector Neural Network Instructions (AVX512 VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel® Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, through the PyTorch* xpu device, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs.
Intel® Extension for PyTorch* provides optimizations for both eager mode and graph mode. Compared to eager mode, however, graph mode in PyTorch* normally yields better performance through optimization techniques such as operation fusion, and Intel® Extension for PyTorch* amplifies these with more comprehensive graph optimizations. We therefore recommend taking advantage of Intel® Extension for PyTorch* with TorchScript whenever your workload supports it. You can capture the graph with either the torch.jit.trace() function or the torch.jit.script() function; based on our evaluation, torch.jit.trace() supports more workloads, so we recommend it as your first choice.
The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing intel_extension_for_pytorch.
- Check the CPU tutorial for detailed information on Intel® Extension for PyTorch* for Intel® CPUs. Source code is available at the master branch.
- Check the GPU tutorial for detailed information on Intel® Extension for PyTorch* for Intel® GPUs. Source code is available at the xpu-master branch.
Installation
CPU version
python -m pip install intel_extension_for_pytorch
python -m pip install intel_extension_for_pytorch -f https://developer.intel.com/ipex-whl-stable-cpu
GPU version
python -m pip install torch==1.13.0a0+git6c9b55e intel_extension_for_pytorch==1.13.120+xpu -f https://developer.intel.com/ipex-whl-stable-xpu
Modifying the YOLOv7 code to add Intel® Extension for PyTorch*
Inference on CPU
import torch
import torchvision.models as models

# Example model from the Intel® Extension for PyTorch* docs (ResNet-50 stands in for any model)
model = models.resnet50(pretrained=True)
model.eval()
data = torch.rand(1, 3, 224, 224)

import intel_extension_for_pytorch as ipex
# Apply Intel® Extension for PyTorch* optimizations for CPU inference
model = ipex.optimize(model)

with torch.no_grad():
    model(data)
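To use the TorchScript graph mode recommended above, the same example can be traced and frozen after optimization. This is a minimal sketch following the pattern shown in the Intel® Extension for PyTorch* documentation (ResNet-50 again stands in for any model):
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(pretrained=True)
model.eval()
data = torch.rand(1, 3, 224, 224)

# Optimize eagerly first, then capture the graph with torch.jit.trace() and freeze it
model = ipex.optimize(model)
with torch.no_grad():
    traced_model = torch.jit.trace(model, data)
    traced_model = torch.jit.freeze(traced_model)
    traced_model(data)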
Inference on GPU
import torch
import torchvision.models as models

model = models.resnet50(pretrained=True)
model.eval()
data = torch.rand(1, 3, 224, 224)

import intel_extension_for_pytorch as ipex
# Move the model and input data to the Intel GPU ("xpu") device, then optimize
model = model.to('xpu')
data = data.to('xpu')
model = ipex.optimize(model)

with torch.no_grad():
    model(data)
Optimizing the YOLOv7 training code (train.py) to use Intel® Extension for PyTorch*
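The exact lines differ between versions of train.py, but the idea is to import the extension and wrap the model and optimizer with ipex.optimize() after they are created and before the training loop begins. The snippet below is a minimal, self-contained sketch of that pattern using a hypothetical stand-in model, not a drop-in patch for train.py:
import torch
import intel_extension_for_pytorch as ipex

# Stand-ins for the objects train.py builds: the YOLOv7 model and its SGD optimizer
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3), torch.nn.ReLU())
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.937, nesterov=True)

model.train()
# Wrap both so Intel® Extension for PyTorch* can apply its training optimizations
model, optimizer = ipex.optimize(model, optimizer=optimizer)

# The existing training loop then uses the returned model and optimizer unchanged
images = torch.rand(2, 3, 64, 64)
optimizer.zero_grad()
loss = model(images).mean()
loss.backward()
optimizer.step()
In train.py itself, the import goes near the top of the file and the ipex.optimize() call goes right after the model and optimizer are constructed.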
Collecting and Annotating the Dataset
To train a YOLOv7 model on a custom dataset, you need a substantial number of labeled images. The first step is to collect images relevant to your target object detection task. Ensure that the dataset is diverse, containing various backgrounds, lighting conditions, and angles.
Next, you must annotate the dataset by labeling each object of interest in the images. Popular annotation tools include LabelImg, RectLabel, and VIA. Annotating involves drawing bounding boxes around objects and assigning corresponding class labels. Aim for accurate and consistent annotations, as they directly impact the model’s performance.
LabelImg
pip install labelImg
labelImg
Dataset Preparation
Once your dataset is annotated, it needs to be prepared in a format compatible with YOLOv7. The YOLOv7 repository expects two things: a dataset YAML file (for example data/custom_data.yaml, used later when launching training) that lists the paths to the training, validation, and test images along with the class names, and YOLO-format labels.
A label is a .txt file that accompanies each image in the dataset and contains its ground-truth annotations, one object per line, in the following format:
<class_label> <x_center> <y_center> <width> <height>
Each value is normalized by the image dimensions, with the class label being an integer index.
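For example, a hypothetical label file for an image containing a single object of class 0 would hold one line such as:
0 0.431 0.507 0.212 0.338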
1. Configuring the YOLOv7 Model
The next step is to configure the YOLOv7 model for training. In this repository the model architecture is defined in a YAML configuration file, cfg/training/yolov7.yaml, rather than a Darknet .cfg file. Make a copy for your dataset (for example cfg/training/yolov7-custom.yaml) and adjust the following, as shown in the excerpt after this list:
- nc: update the number of classes to match your custom dataset.
- anchors: the defaults usually work, but they can be re-tuned if your objects have unusual sizes or aspect ratios.
- Batch size, image size, and the number of epochs are not set in this file; they are passed as command-line arguments to train.py, so choose them based on your hardware capabilities and dataset size.
- Learning rate schedule, momentum, weight decay, and augmentation strengths live in the hyperparameter file (for example data/hyp.scratch.custom.yaml).
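The excerpt below sketches what the top of a hypothetical cfg/training/yolov7-custom.yaml might look like for a two-class dataset; apart from nc, the anchors, backbone, and head sections are left exactly as in the stock yolov7.yaml:
# cfg/training/yolov7-custom.yaml -- copy of yolov7.yaml with nc changed
nc: 2                # number of classes in the custom dataset (hypothetical example)
depth_multiple: 1.0  # model depth multiple (unchanged)
width_multiple: 1.0  # layer channel multiple (unchanged)
# anchors, backbone, and head follow, unchanged from the original file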
2. Preparing the Pretrained Weights
To speed up training and improve performance, it is common to start from pretrained weights. YOLOv7 can be initialized with weights pretrained on a large dataset such as COCO. Download the pretrained weights (e.g. yolov7.pt) from the releases page of the official YOLOv7 GitHub repository.
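At the time of writing, the COCO-pretrained checkpoint is published as a release asset of the WongKinYiu/yolov7 repository; a command along these lines should fetch it (check the releases page if the asset has moved):
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt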
3. Training the YOLOv7 Model
With the dataset prepared and the model configured, it’s time to train the YOLOv7 model. Use a GPU-enabled machine for faster training. Execute the following steps:
- Split your dataset into training and validation sets.
- Divide the training set into batches.
- Initialize the model with the pretrained weights.
- Start the training process using a suitable optimizer like Stochastic Gradient Descent (SGD).
- Monitor the loss function and make adjustments if needed.
- Periodically evaluate the model’s performance on the validation set.
4. Fine-tuning and Hyperparameter Optimization
Training a YOLOv7 model is an iterative process. Experiment with hyperparameter settings such as learning rate, momentum, and weight decay to improve the model's accuracy and convergence speed. Additionally, you may employ techniques like stronger data augmentation, adjusting anchor sizes, or trying other YOLO variants.
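For reference, these knobs correspond to keys in the hyperparameter file passed to train.py. A short, illustrative excerpt of data/hyp.scratch.custom.yaml (the values shown are typical defaults, not recommendations):
lr0: 0.01            # initial learning rate
lrf: 0.1             # final learning-rate fraction for the scheduler
momentum: 0.937      # SGD momentum
weight_decay: 0.0005 # optimizer weight decay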
Training the Custom YOLOv7 Model
We have now added our Intel® Extension for PyTorch* optimization code in train.py, placed our custom data under yolov7/data, and created three folders (train, test, val), each with images and labels subfolders.
We have also prepared and fine-tuned our configuration files: data/custom_data.yaml, data/hyp.scratch.custom.yaml, and cfg/training/yolov7-custom.yaml.
Finally, we downloaded the pretrained weights from the YOLOv7 repo and saved them in the current directory.
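As an illustration, a minimal data/custom_data.yaml for a hypothetical two-class dataset laid out as described above could look like this:
train: ./data/train/images   # training images (labels are read from the sibling labels folders)
val: ./data/val/images       # validation images
test: ./data/test/images     # test images (optional)

nc: 2                        # number of classes (hypothetical example)
names: ['cat', 'dog']        # class names (hypothetical example)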
Now, in the terminal, let's run the command to train our model.
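A typical invocation, assuming the file names above and example values for image size, batch size, and epochs (use --device 0 instead of --device cpu if a CUDA GPU is available):
python train.py --device cpu --batch-size 16 --epochs 100 --img 640 640 --data data/custom_data.yaml --hyp data/hyp.scratch.custom.yaml --cfg cfg/training/yolov7-custom.yaml --weights yolov7.pt --name yolov7-custom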
Testing and Deployment
Once the YOLOv7 model has been trained, it’s time to test its performance on unseen data. Use the model to detect objects in new images or videos, and evaluate its accuracy and speed. Fine-tune the model further if necessary, based on the test results.
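The repository ships a detect.py script for exactly this; a typical run over a folder of new images, using the best checkpoint produced by the training run above, might look like the following (paths and thresholds are example values):
python detect.py --weights runs/train/yolov7-custom/weights/best.pt --source inference/images --conf 0.25 --img-size 640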
For deployment, consider converting the trained PyTorch model to a format compatible with your target deployment environment, such as ONNX or TorchScript.
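The repository also includes an export.py script for ONNX export; exact flags vary between versions, but an invocation along these lines is typical (run python export.py --help in your checkout to confirm):
python export.py --weights runs/train/yolov7-custom/weights/best.pt --grid --simplify --img-size 640 640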
Conclusion
Training a YOLOv7 model on a custom dataset allows you to create a powerful object detection system tailored to your specific needs. By following the steps outlined in this guide, you can effectively collect and annotate a dataset, configure the model, and train it to achieve accurate and efficient object detection. Embrace the possibilities that YOLOv7 offers and unlock the potential of computer vision in your applications.