Convert any dataset to COCO object detection format with SAHI

Fatih Cagatay Akyon
Published in Codable
2 min read · Jan 3, 2022

After reading this post, you will be able to easily convert any dataset into COCO object detection format 🚀

1. Install sahi:

pip install sahi

2. Import required classes:

from sahi.utils.coco import Coco, CocoCategory, CocoImage, CocoAnnotation
from sahi.utils.file import save_json

3. Init Coco object:

coco = Coco()

4. Add categories (starting from id 0):

coco.add_category(CocoCategory(id=0, name='human'))
coco.add_category(CocoCategory(id=1, name='vehicle'))

5. Create a Coco image:

coco_image = CocoImage(file_name="image1.jpg", height=1080, width=1920)

You can read the image size with Pillow without loading the pixel data into memory:

from PIL import Image
width, height = Image.open("image1.jpg").size

6. Add annotations to Coco image:

# x_min, y_min, width, height: top-left corner and size of the box, in pixels
coco_image.add_annotation(
    CocoAnnotation(
        bbox=[x_min, y_min, width, height],
        category_id=0,
        category_name='human'
    )
)
coco_image.add_annotation(
    CocoAnnotation(
        bbox=[x_min, y_min, width, height],
        category_id=1,
        category_name='vehicle'
    )
)
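If your source dataset stores boxes in another layout, convert them to [x_min, y_min, width, height] before this step. Here is a minimal sketch, assuming either corner-style (x_min, y_min, x_max, y_max) boxes or normalized YOLO-style (x_center, y_center, width, height) boxes. Both helper names are hypothetical and are not part of sahi:

```python
def xyxy_to_coco_bbox(x_min, y_min, x_max, y_max):
    """Convert corner coordinates to [x_min, y_min, width, height].

    Hypothetical helper for illustration; not part of sahi.
    """
    if x_max < x_min or y_max < y_min:
        raise ValueError("max corner must not be smaller than min corner")
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def yolo_to_coco_bbox(x_center, y_center, w, h, image_width, image_height):
    """Convert normalized (center, size) values to pixel [x_min, y_min, width, height].

    Assumes all four box values are fractions of the image size (0 to 1).
    Hypothetical helper for illustration; not part of sahi.
    """
    box_width = w * image_width
    box_height = h * image_height
    x_min = x_center * image_width - box_width / 2
    y_min = y_center * image_height - box_height / 2
    return [x_min, y_min, box_width, box_height]
```

The returned list can be passed directly as the bbox argument of CocoAnnotation.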

7. Add Coco image to Coco object:

coco.add_image(coco_image)

8. After adding all images, export the Coco object as a COCO object detection formatted JSON file:

save_json(data=coco.json, save_path=save_path)
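Before training on the exported file, it can help to sanity-check it. The sketch below uses only the standard library and assumes the usual COCO layout: top-level "images", "categories" and "annotations" lists, where each annotation carries "image_id", "category_id" and "bbox". The function names are hypothetical and this is not a sahi feature:

```python
import json


def find_coco_problems(coco_dict):
    """Return a list of human-readable problems found in a COCO-style dict."""
    image_ids = {image["id"] for image in coco_dict.get("images", [])}
    category_ids = {category["id"] for category in coco_dict.get("categories", [])}

    problems = []
    for annotation in coco_dict.get("annotations", []):
        annotation_id = annotation.get("id")
        # Every annotation must point at an image and a category that exist.
        if annotation.get("image_id") not in image_ids:
            problems.append(f"annotation {annotation_id}: unknown image_id")
        if annotation.get("category_id") not in category_ids:
            problems.append(f"annotation {annotation_id}: unknown category_id")
        # bbox must be [x_min, y_min, width, height] with a positive size.
        bbox = annotation.get("bbox", [])
        if len(bbox) != 4 or bbox[2] <= 0 or bbox[3] <= 0:
            problems.append(f"annotation {annotation_id}: invalid bbox")
    return problems


def find_problems_in_file(json_path):
    """Load an exported COCO JSON file and check it."""
    with open(json_path) as f:
        return find_coco_problems(json.load(f))
```

An empty list from find_problems_in_file(save_path) means none of these basic inconsistencies were found.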

That’s all ✔️

Bonus 1 🎁 xView to COCO conversion script: https://github.com/fcakyon/sahi-benchmark/blob/main/xview/xview_to_coco.py

Bonus 2 🎁 VisDrone to COCO conversion script: https://github.com/fcakyon/sahi-benchmark/blob/main/visdrone/visdrone_to_coco.py
