Computer vision has made tremendous strides in recent years, thanks to deep learning models like YOLO (You Only Look Once). YOLO is known for its exceptional real-time object detection capabilities and has become a popular choice for a wide range of applications. In this article, we will walk through optimizing the YOLOv7 model with Intel® Extension for PyTorch*, then training a YOLOv7 model on a custom dataset, empowering you to create your own powerful object detection system.
What is YOLOv7?
YOLOv7 is an extension of the YOLO series of object detection models. It combines the best features of previous versions while introducing improvements in both accuracy and speed. The official YOLOv7 implementation is built on PyTorch and uses a single neural network to simultaneously predict object bounding boxes and their associated class probabilities.
Cloning the YOLOv7 Repo and Installing the Required Dependencies/Packages
git clone https://github.com/WongKinYiu/yolov7
cd yolov7
pip install virtualenv
virtualenv venv
venv\Scripts\activate        # Windows
source venv/bin/activate     # Linux/macOS
pip install -r requirements.txt
How to fine-tune the existing YOLOv7 code to add Intel® Extension for PyTorch*?
Intel® Extension for PyTorch* extends PyTorch* with the latest feature optimizations for an extra performance boost on Intel hardware. Optimizations take advantage of AVX-512 Vector Neural Network Instructions (AVX512 VNNI) and Intel® Advanced Matrix Extensions (Intel® AMX) on Intel CPUs, as well as Intel® Xe Matrix Extensions (XMX) AI engines on Intel discrete GPUs. Moreover, through the PyTorch* xpu device, Intel® Extension for PyTorch* provides easy GPU acceleration for Intel discrete GPUs.
Intel® Extension for PyTorch* provides optimizations for both eager mode and graph mode. Compared to eager mode, however, graph mode in PyTorch* normally yields better performance through optimization techniques such as operation fusion, and Intel® Extension for PyTorch* amplifies these with more comprehensive graph optimizations. We therefore recommend taking advantage of Intel® Extension for PyTorch* with TorchScript whenever your workload supports it. You can capture the graph with either the torch.jit.trace() function or the torch.jit.script() function; based on our evaluation, torch.jit.trace() supports more workloads, so we recommend it as your first choice.
The extension can be loaded as a Python module for Python programs or linked as a C++ library for C++ programs. In Python scripts, users can enable it dynamically by importing intel_extension_for_pytorch.
- Check the CPU tutorial for detailed information on Intel® Extension for PyTorch* for Intel® CPUs. Source code is available at the master branch.
- Check the GPU tutorial for detailed information on Intel® Extension for PyTorch* for Intel® GPUs. Source code is available at the xpu-master branch.
Installation
CPU version
python -m pip install intel_extension_for_pytorch
python -m pip install intel_extension_for_pytorch -f https://developer.intel.com/ipex-whl-stable-cpu
GPU version
python -m pip install torch==1.13.0a0+git6c9b55e intel_extension_for_pytorch==1.13.120+xpu -f https://developer.intel.com/ipex-whl-stable-xpu
Modifying the YOLOv7 code to add Intel® Extension for PyTorch*
Inference on CPU
import torch
import torchvision.models as models

# Example model from the Intel® Extension for PyTorch* docs (ResNet-50 stands in for any model)
model = models.resnet50(pretrained=True)
model.eval()
data = torch.rand(1, 3, 224, 224)

import intel_extension_for_pytorch as ipex
# Apply Intel® Extension for PyTorch* optimizations for CPU inference
model = ipex.optimize(model)

with torch.no_grad():
    model(data)
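To use the TorchScript graph mode recommended above, the same example can be traced and frozen after optimization. This is a minimal sketch following the pattern shown in the Intel® Extension for PyTorch* documentation (ResNet-50 again stands in for any model):
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(pretrained=True)
model.eval()
data = torch.rand(1, 3, 224, 224)

# Optimize eagerly first, then capture the graph with torch.jit.trace() and freeze it
model = ipex.optimize(model)
with torch.no_grad():
    traced_model = torch.jit.trace(model, data)
    traced_model = torch.jit.freeze(traced_model)
    traced_model(data)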
Inference on GPU
import torch
import torchvision.models as models

model = models.resnet50(pretrained=True)
model.eval()
data = torch.rand(1, 3, 224, 224)

import intel_extension_for_pytorch as ipex
# Move the model and input data to the Intel GPU ("xpu") device, then optimize
model = model.to('xpu')
data = data.to('xpu')
model = ipex.optimize(model)

with torch.no_grad():
    model(data)
Optimizing the YOLOv7 training code (train.py) to use Intel® Extension for PyTorch*
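The exact lines differ between versions of train.py, but the idea is to import the extension and wrap the model and optimizer with ipex.optimize() after they are created and before the training loop begins. The snippet below is a minimal, self-contained sketch of that pattern using a hypothetical stand-in model, not a drop-in patch for train.py:
import torch
import intel_extension_for_pytorch as ipex

# Stand-ins for the objects train.py builds: the YOLOv7 model and its SGD optimizer
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3), torch.nn.ReLU())
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.937, nesterov=True)

model.train()
# Wrap both so Intel® Extension for PyTorch* can apply its training optimizations
model, optimizer = ipex.optimize(model, optimizer=optimizer)

# The existing training loop then uses the returned model and optimizer unchanged
images = torch.rand(2, 3, 64, 64)
optimizer.zero_grad()
loss = model(images).mean()
loss.backward()
optimizer.step()
In train.py itself, the import goes near the top of the file and the ipex.optimize() call goes right after the model and optimizer are constructed.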
Collecting and Annotating the Dataset
To train a YOLOv7 model on a custom dataset, you need a substantial number of labeled images. The first step is to collect images relevant to your target object detection task. Ensure that the dataset is diverse, containing various backgrounds, lighting conditions, and angles.
Next, you must annotate the dataset by labeling each object of interest in the images. Popular annotation tools include LabelImg, RectLabel, and VIA. Annotating involves drawing bounding boxes around objects and assigning corresponding class labels. Aim for accurate and consistent annotations, as they directly impact the model’s performance.
LabelImg
pip install labelImg
labelImg
Dataset Preparation
Once your dataset is annotated, it needs to be prepared in a format compatible with YOLOv7. The YOLOv7 repository expects two things: a dataset YAML file (for example data/custom_data.yaml, used later when launching training) that lists the paths to the training, validation, and test images along with the class names, and YOLO-format labels.
A label is a .txt file that accompanies each image in the dataset and contains its ground-truth annotations, one object per line, in the following format:
<class_label> <x_center> <y_center> <width> <height>
Each value is normalized by the image dimensions, with the class label being an integer index.
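For example, a hypothetical label file for an image containing a single object of class 0 would hold one line such as:
0 0.431 0.507 0.212 0.338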
1. Configuring the YOLOv7 Model
The next step is to configure the YOLOv7 model for training. In this repository the model architecture is defined in a YAML configuration file, cfg/training/yolov7.yaml, rather than a Darknet .cfg file. Make a copy for your dataset (for example cfg/training/yolov7-custom.yaml) and adjust the following, as shown in the excerpt after this list:
- nc: update the number of classes to match your custom dataset.
- anchors: the defaults usually work, but they can be re-tuned if your objects have unusual sizes or aspect ratios.
- Batch size, image size, and the number of epochs are not set in this file; they are passed as command-line arguments to train.py, so choose them based on your hardware capabilities and dataset size.
- Learning rate schedule, momentum, weight decay, and augmentation strengths live in the hyperparameter file (for example data/hyp.scratch.custom.yaml).
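The excerpt below sketches what the top of a hypothetical cfg/training/yolov7-custom.yaml might look like for a two-class dataset; apart from nc, the anchors, backbone, and head sections are left exactly as in the stock yolov7.yaml:
# cfg/training/yolov7-custom.yaml -- copy of yolov7.yaml with nc changed
nc: 2                # number of classes in the custom dataset (hypothetical example)
depth_multiple: 1.0  # model depth multiple (unchanged)
width_multiple: 1.0  # layer channel multiple (unchanged)
# anchors, backbone, and head follow, unchanged from the original file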
2. Preparing the Pretrained Weights
To speed up training and improve performance, it is common to start from pretrained weights. YOLOv7 can be initialized with weights pretrained on a large dataset such as COCO. Download the pretrained weights (e.g. yolov7.pt) from the releases page of the official YOLOv7 GitHub repository.
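At the time of writing, the COCO-pretrained checkpoint is published as a release asset of the WongKinYiu/yolov7 repository; a command along these lines should fetch it (check the releases page if the asset has moved):
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt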
3. Training the YOLOv7 Model
With the dataset prepared and the model configured, it’s time to train the YOLOv7 model. Use a GPU-enabled machine for faster training. Execute the following steps:
- Split your dataset into training and validation sets.
- Divide the training set into batches.
- Initialize the model with the pretrained weights.
- Start the training process using a suitable optimizer like Stochastic Gradient Descent (SGD).
- Monitor the loss function and make adjustments if needed.
- Periodically evaluate the model’s performance on the validation set.
4. Fine-tuning and Hyperparameter Optimization
Training a YOLOv7 model is an iterative process. Experiment with hyperparameter settings such as learning rate, momentum, and weight decay to improve the model's accuracy and convergence speed. Additionally, you may employ techniques like stronger data augmentation, adjusting anchor sizes, or trying other YOLO variants.
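For reference, these knobs correspond to keys in the hyperparameter file passed to train.py. A short, illustrative excerpt of data/hyp.scratch.custom.yaml (the values shown are typical defaults, not recommendations):
lr0: 0.01            # initial learning rate
lrf: 0.1             # final learning-rate fraction for the scheduler
momentum: 0.937      # SGD momentum
weight_decay: 0.0005 # optimizer weight decay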
Training the Custom YOLOv7 Model
We have now added our Intel® Extension for PyTorch* optimization code in train.py, placed our custom data under yolov7/data, and created three folders (train, test, val), each with images and labels subfolders.
We have also prepared and fine-tuned our configuration files: data/custom_data.yaml, data/hyp.scratch.custom.yaml, and cfg/training/yolov7-custom.yaml.
Finally, we downloaded the pretrained weights from the YOLOv7 repo and saved them in the current directory.
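As an illustration, a minimal data/custom_data.yaml for a hypothetical two-class dataset laid out as described above could look like this:
train: ./data/train/images   # training images (labels are read from the sibling labels folders)
val: ./data/val/images       # validation images
test: ./data/test/images     # test images (optional)

nc: 2                        # number of classes (hypothetical example)
names: ['cat', 'dog']        # class names (hypothetical example)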
Now, in the terminal, let's run the command to train our model.
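A typical invocation, assuming the file names above and example values for image size, batch size, and epochs (use --device 0 instead of --device cpu if a CUDA GPU is available):
python train.py --device cpu --batch-size 16 --epochs 100 --img 640 640 --data data/custom_data.yaml --hyp data/hyp.scratch.custom.yaml --cfg cfg/training/yolov7-custom.yaml --weights yolov7.pt --name yolov7-custom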
Testing and Deployment
Once the YOLOv7 model has been trained, it’s time to test its performance on unseen data. Use the model to detect objects in new images or videos, and evaluate its accuracy and speed. Fine-tune the model further if necessary, based on the test results.
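The repository ships a detect.py script for exactly this; a typical run over a folder of new images, using the best checkpoint produced by the training run above, might look like the following (paths and thresholds are example values):
python detect.py --weights runs/train/yolov7-custom/weights/best.pt --source inference/images --conf 0.25 --img-size 640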
For deployment, consider converting the trained PyTorch model to a format compatible with your target deployment environment, such as ONNX or TorchScript.
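The repository also includes an export.py script for ONNX export; exact flags vary between versions, but an invocation along these lines is typical (run python export.py --help in your checkout to confirm):
python export.py --weights runs/train/yolov7-custom/weights/best.pt --grid --simplify --img-size 640 640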
Conclusion
Training a YOLOv7 model on a custom dataset allows you to create a powerful object detection system tailored to your specific needs. By following the steps outlined in this guide, you can effectively collect and annotate a dataset, configure the model, and train it to achieve accurate and efficient object detection. Embrace the possibilities that YOLOv7 offers and unlock the potential of computer vision in your applications.