Image Processing with Python

Annotating Blood Cells using Python, OpenCV, and NumPy

Published in Internet of Technology

4 min read · Feb 3, 2024

This article explores image processing with Python using OpenCV and NumPy. Sounds complex? Worry not! We’re making it super simple! Together, we’ll learn how to annotate blood cells in images through these steps:

Setting up our environment
Parsing XML annotations
Drawing bounding boxes on images
Displaying annotated images

Step 1. Setting Up Our Environment

Before we begin coding, let’s ensure we have all the necessary tools. Install these packages to get started:

pip install opencv-python
pip install matplotlib
pip install numpy

Include these import statements in your script:

import cv2
import matplotlib.pyplot as plt
import xml.etree.ElementTree as ET
from numpy import ndarray

Here’s a brief overview of what each will do in our project:

cv2: For image processing tasks.
matplotlib.pyplot: For displaying images.
xml.etree.ElementTree: To parse XML files containing annotations.
ndarray: Just used as a type hint.

Step 2. Parsing XML Annotations

To annotate, we must first understand our data. We’ll use a dataset like the blood cell count dataset found here, where an image comes with an XML file describing its annotations. In this step, we read the type and position of every blood cell annotated in an image:

def parse_xml(xml_file: str) -> list[dict[str, dict[str, int]]]:
    """Parse an XML file to extract annotations for blood cells.

    :param xml_file: The path to the XML annotation file.
    :return: A list of dictionaries containing the name and bounding box
        coordinates for each annotated object.
    """
    tree = ET.parse(xml_file)
    root = tree.getroot()

    annotations = []
    for obj in root.iter("object"):
        annotations.append(
            {
                "name": obj.find("name").text,
                "bndbox": {
                    "xmin": int(obj.find("bndbox/xmin").text),
                    "ymin": int(obj.find("bndbox/ymin").text),
                    "xmax": int(obj.find("bndbox/xmax").text),
                    "ymax": int(obj.find("bndbox/ymax").text),
                },
            }
        )
    return annotations

This function turns the XML annotation data into a Python list of dictionaries, one per annotated cell, each holding the cell type and its bounding box in pixel coordinates.
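
To make the expected input concrete, here is a minimal sketch of the XML layout that parse_xml reads, parsed from a string instead of a file. The element names mirror the ones the function looks up; the cell names and coordinates are made up for illustration, and parse_xml_string is a hypothetical variant that is not part of the original script.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the values are invented for illustration only.
SAMPLE_XML = """
<annotation>
    <object>
        <name>WBC</name>
        <bndbox>
            <xmin>260</xmin>
            <ymin>177</ymin>
            <xmax>491</xmax>
            <ymax>376</ymax>
        </bndbox>
    </object>
    <object>
        <name>RBC</name>
        <bndbox>
            <xmin>78</xmin>
            <ymin>336</ymin>
            <xmax>184</xmax>
            <ymax>435</ymax>
        </bndbox>
    </object>
</annotation>
"""


def parse_xml_string(xml_text: str) -> list[dict]:
    """Apply the same logic as parse_xml, but read from a string."""
    root = ET.fromstring(xml_text.strip())
    annotations = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        annotations.append(
            {
                "name": obj.findtext("name"),
                "bndbox": {
                    key: int(box.findtext(key))
                    for key in ("xmin", "ymin", "xmax", "ymax")
                },
            }
        )
    return annotations


sample_annotations = parse_xml_string(SAMPLE_XML)
for ann in sample_annotations:
    print(ann["name"], ann["bndbox"])
```

Each printed line shows one annotation: the cell type followed by its four box coordinates.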

Step 3. Drawing Bounding Boxes

Next, we visualize the annotations by outlining each blood cell on the image with bounding boxes:

def draw_bounding_boxes(
    image: ndarray, annotations: list[dict[str, dict[str, int]]]
) -> ndarray:
    """Draw bounding boxes on an image according to provided annotations.

    :param image: The image on which to draw, as a numpy array.
    :param annotations: A list of dictionaries with annotation details.
    :return: The image with bounding boxes.
    """
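    for ann in annotations:
        # Read coordinates by key rather than relying on dict ordering.
        box = ann["bndbox"]
        xmin, ymin = box["xmin"], box["ymin"]
        xmax, ymax = box["xmax"], box["ymax"]
        # Red for white blood cells, green for all other cell types
        # (assuming the image is in RGB order).
        color = (255, 0, 0) if ann["name"] == "WBC" else (0, 255, 0)
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 2)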
        cv2.putText(
            image,
            ann["name"],
            (xmin, ymin - 10),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.5,
            color,
            2,
        )
    return image

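This code draws a rectangle and a text label for each annotation: red for white blood cells (WBC) and green for every other cell type. The drawing happens in place, so the returned array is the same object as the input image.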
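
One detail worth noting: cv2.putText takes the bottom-left corner of the text as its origin, and the label is placed at ymin - 10, so it falls outside the image when a box sits very close to the top edge. Here is a minimal sketch of one way to handle that. The helper label_origin is hypothetical (not part of the original script), and the text_height of 15 pixels is a rough assumption for this font size rather than a measured value.

```python
def label_origin(
    xmin: int, ymin: int, offset: int = 10, text_height: int = 15
) -> tuple[int, int]:
    """Return the bottom-left corner for a label.

    The label goes above the box when there is room for it, and just
    inside the top of the box otherwise.
    """
    y = ymin - offset
    if y < text_height:
        # Not enough room above the box: draw inside it instead.
        y = ymin + text_height
    return xmin, y


print(label_origin(100, 200))  # room above the box
print(label_origin(100, 5))  # box close to the top edge
```

In draw_bounding_boxes, the returned tuple would take the place of (xmin, ymin - 10) in the cv2.putText call.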

Step 4. Displaying Annotated Images

Finally, let’s display our annotated image:

def main(image_file: str, xml_file: str) -> None:
    """Main function for annotating blood cells in an image.

    :param image_file: A path to the image file.
    :param xml_file: A path to the XML annotation file.
    """
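    annotations = parse_xml(xml_file)
    image = cv2.imread(image_file)
    # cv2.imread returns None instead of raising when the file cannot be read.
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_file}")
    # OpenCV loads images in BGR order; Matplotlib expects RGB.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_with_boxes = draw_bounding_boxes(image, annotations)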

    plt.figure(figsize=(8, 6))
    plt.imshow(image_with_boxes)
    plt.axis("off")
    plt.show()

    cell_counts = {}
    for ann in annotations:
        cell_type = ann["name"]
        cell_counts[cell_type] = cell_counts.get(cell_type, 0) + 1

    print(cell_counts)

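Here we load the image, convert it from OpenCV’s BGR channel order to RGB, draw the annotations, and display the result with Matplotlib. The function then counts the annotations per cell type and prints the totals.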
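
The counting loop can also be written with collections.Counter from the standard library. A small sketch, using made-up annotations in the shape returned by parse_xml (the names and numbers below are illustrative, not results from the dataset):

```python
from collections import Counter

# Hypothetical annotations in the shape returned by parse_xml.
annotations = [
    {
        "name": "RBC",
        "bndbox": {"xmin": 78, "ymin": 336, "xmax": 184, "ymax": 435},
    },
    {
        "name": "WBC",
        "bndbox": {"xmin": 260, "ymin": 177, "xmax": 491, "ymax": 376},
    },
    {
        "name": "RBC",
        "bndbox": {"xmin": 63, "ymin": 237, "xmax": 169, "ymax": 336},
    },
]

# Counter tallies one entry per distinct cell type.
cell_counts = Counter(ann["name"] for ann in annotations)
print(dict(cell_counts))
```

A Counter also returns 0 for cell types that never appear, which avoids a KeyError when looking up a type that is missing from an image.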

Full Code

Here’s the complete script for annotating blood cells:

"""Annotate blood cells."""

import argparse
import xml.etree.ElementTree as ET
import cv2
import matplotlib.pyplot as plt
from numpy import ndarray


def parse_xml(xml_file: str) -> list[dict[str, dict[str, int]]]:
    """Parse an XML file to extract annotations for blood cells.

    :param xml_file: The path to the XML annotation file.
    :return: A list of dictionaries containing the name and bounding box
        coordinates for each annotated object.
    """
    tree = ET.parse(xml_file)
    root = tree.getroot()

    annotations = []
    for obj in root.iter("object"):
        annotations.append(
            {
                "name": obj.find("name").text,
                "bndbox": {
                    "xmin": int(obj.find("bndbox/xmin").text),
                    "ymin": int(obj.find("bndbox/ymin").text),
                    "xmax": int(obj.find("bndbox/xmax").text),
                    "ymax": int(obj.find("bndbox/ymax").text),
                },
            }
        )
    return annotations


def draw_bounding_boxes(
    image: ndarray, annotations: list[dict[str, dict[str, int]]]
) -> ndarray:
    """Draw bounding boxes on an image according to provided annotations.

    :param image: The image on which to draw, as a numpy array.
    :param annotations: A list of dictionaries with annotation details.
    :return: The image with bounding boxes.
    """
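    for ann in annotations:
        # Read coordinates by key rather than relying on dict ordering.
        box = ann["bndbox"]
        xmin, ymin = box["xmin"], box["ymin"]
        xmax, ymax = box["xmax"], box["ymax"]
        # Red for white blood cells, green for all other cell types
        # (assuming the image is in RGB order).
        color = (255, 0, 0) if ann["name"] == "WBC" else (0, 255, 0)
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 2)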
        cv2.putText(
            image,
            ann["name"],
            (xmin, ymin - 10),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.5,
            color,
            2,
        )
    return image


def main(image_file: str, xml_file: str) -> None:
    """Main function for annotating blood cells in an image.

    :param image_file: A path to the image file.
    :param xml_file: A path to the XML annotation file.
    """
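    annotations = parse_xml(xml_file)
    image = cv2.imread(image_file)
    # cv2.imread returns None instead of raising when the file cannot be read.
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_file}")
    # OpenCV loads images in BGR order; Matplotlib expects RGB.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_with_boxes = draw_bounding_boxes(image, annotations)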

    plt.figure(figsize=(8, 6))
    plt.imshow(image_with_boxes)
    plt.axis("off")
    plt.show()

    cell_counts = {}
    for ann in annotations:
        cell_type = ann["name"]
        cell_counts[cell_type] = cell_counts.get(cell_type, 0) + 1

    print(cell_counts)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Annotate blood cells in an image given XML annotations."
    )
    parser.add_argument("image", type=str, help="Path to the image file.")
    parser.add_argument(
        "xml", type=str, help="Path to the XML annotation file."
    )

    args = parser.parse_args()

    main(args.image, args.xml)

Testing the Annotations

Now, let’s test it:

$ python image_analysis/annotate_blood_cells.py blood_cells/BloodImage_00000.jpg blood_cells/BloodImage_00000.xml

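Running the script opens a Matplotlib window showing the image with its bounding boxes and labels, and prints the number of annotations per cell type to the terminal.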
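
In the test command, the image and its annotation file share the same base name. Assuming that holds for every image (an assumption based on this one example, not something verified across the dataset), a hypothetical helper such as xml_path_for could derive the XML path so that only the image path needs to be passed:

```python
from pathlib import Path


def xml_path_for(image_file: str) -> str:
    """Return the annotation path that sits next to an image file."""
    # Swap the image extension for .xml, keeping folder and base name.
    return str(Path(image_file).with_suffix(".xml"))


print(xml_path_for("blood_cells/BloodImage_00000.jpg"))
```

The result could then be passed to main as the xml_file argument.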

Written by Oliver Lövström