Image Processing with Python
Annotating Blood Cells using Python, OpenCV, and NumPy
This article explores image processing with Python using OpenCV and NumPy. Sounds complex? Worry not, we'll keep it simple. Together, we'll annotate blood cells in images through these steps:
- Setting up our environment
- Parsing XML annotations
- Drawing bounding boxes on images
- Displaying annotated images
Step 1. Setting Up Our Environment
Before we begin coding, let’s ensure we have all the necessary tools. Install these packages to get started:
pip install opencv-python
pip install matplotlib
pip install numpy
Include these import statements in your script:
import cv2
import matplotlib.pyplot as plt
import xml.etree.ElementTree as ET
from numpy import ndarray
Here’s a brief overview of what each will do in our project:
- cv2: For image processing tasks.
- matplotlib.pyplot: For displaying images.
- xml.etree.ElementTree: To parse XML files containing annotations.
- ndarray: Used only as a type hint.
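Before moving on, it can help to confirm that the three packages are visible to the interpreter you plan to run the script with. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for this illustration, and it checks only that the modules can be located, not that they work.

```python
from importlib.util import find_spec

# Top-level import names for the three pip packages above
# (opencv-python is imported as cv2).
REQUIRED = ("cv2", "matplotlib", "numpy")


def missing_packages(names: tuple[str, ...] = REQUIRED) -> list[str]:
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
print("Missing:", missing if missing else "none")
```

If anything is listed as missing, rerun the matching pip command in the same environment.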
Step 2. Parsing XML Annotations
To annotate, we must first understand our data. We'll parse XML files to extract annotations from a dataset such as the blood cell count dataset. Each annotated cell is stored as an object element with a name (the cell type) and a bndbox holding the xmin, ymin, xmax, and ymax pixel coordinates. This step reads those positions and types for one image:
def parse_xml(xml_file: str) -> list[dict[str, dict[str, int]]]:
    """Parse an XML file to extract annotations for blood cells.

    :param xml_file: The path to the XML annotation file.
    :return: A list of dictionaries containing the name and bounding box
        coordinates for each annotated object.
    """
    tree = ET.parse(xml_file)
    root = tree.getroot()
    annotations = []
    for obj in root.iter("object"):
        annotations.append(
            {
                "name": obj.find("name").text,
                "bndbox": {
                    "xmin": int(obj.find("bndbox/xmin").text),
                    "ymin": int(obj.find("bndbox/ymin").text),
                    "xmax": int(obj.find("bndbox/xmax").text),
                    "ymax": int(obj.find("bndbox/ymax").text),
                },
            }
        )
    return annotations
This function transforms XML annotation data into a Python list that is ready for processing.
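To make the expected input concrete, here is a small sketch that runs the same extraction logic on a hand-written annotation held in a string. The XML below is a made-up sample that only mirrors the tags the parser reads (object, name, bndbox); the coordinates, the "RBC" label, and the helper name `parse_xml_string` are illustrative assumptions, not values taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Made-up sample; real annotation files may contain additional tags.
SAMPLE_XML = """
<annotation>
    <object>
        <name>WBC</name>
        <bndbox><xmin>260</xmin><ymin>177</ymin><xmax>491</xmax><ymax>376</ymax></bndbox>
    </object>
    <object>
        <name>RBC</name>
        <bndbox><xmin>78</xmin><ymin>336</ymin><xmax>184</xmax><ymax>435</ymax></bndbox>
    </object>
</annotation>
"""


def parse_xml_string(xml_text: str) -> list[dict]:
    """Extract name and bounding box per object, reading from a string."""
    root = ET.fromstring(xml_text.strip())
    annotations = []
    for obj in root.iter("object"):
        annotations.append(
            {
                "name": obj.find("name").text,
                "bndbox": {
                    key: int(obj.find(f"bndbox/{key}").text)
                    for key in ("xmin", "ymin", "xmax", "ymax")
                },
            }
        )
    return annotations


annotations = parse_xml_string(SAMPLE_XML)
print(annotations[0])
```

Each entry is a plain dictionary, so the rest of the script can read the label with `ann["name"]` and the coordinates with `ann["bndbox"]["xmin"]` and so on.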
Step 3. Drawing Bounding Boxes
Next, we visualize the annotations by outlining each blood cell on the image with bounding boxes:
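def draw_bounding_boxes(
    image: ndarray, annotations: list[dict[str, dict[str, int]]]
) -> ndarray:
    """Draw bounding boxes on an image according to provided annotations.

    :param image: The image on which to draw, as a numpy array.
    :param annotations: A list of dictionaries with annotation details.
    :return: The image with bounding boxes.
    """
    for ann in annotations:
        box = ann["bndbox"]
        # Look up each coordinate by key rather than relying on dict order.
        xmin, ymin, xmax, ymax = (
            box[key] for key in ("xmin", "ymin", "xmax", "ymax")
        )
        # Colors are RGB here because main() converts the image before
        # drawing: red for white blood cells, green for everything else.
        color = (255, 0, 0) if ann["name"] == "WBC" else (0, 255, 0)
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 2)
        cv2.putText(
            image,
            ann["name"],
            # Keep the label inside the image when the box touches the top.
            (xmin, max(ymin - 10, 15)),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.5,
            color,
            2,
        )
    return image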
This code will draw rectangles and labels on the image according to the annotations.
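The function above distinguishes only "WBC" from everything else. If you want a separate color per cell type, one option is a lookup table with a fallback. This is a sketch under assumptions: "WBC" comes from the code above, while "RBC" and "Platelets" are assumed label names that you should check against your own annotation files, and `color_for` is a hypothetical helper.

```python
# RGB colors, matching the RGB conversion done before drawing.
# "WBC" comes from the article; the other labels are assumed examples.
CELL_COLORS = {
    "WBC": (255, 0, 0),
    "RBC": (0, 255, 0),
    "Platelets": (0, 0, 255),
}
DEFAULT_COLOR = (0, 255, 0)  # same fallback as the original else-branch


def color_for(name: str) -> tuple[int, int, int]:
    """Return the drawing color for a cell type, or the default if unknown."""
    return CELL_COLORS.get(name, DEFAULT_COLOR)


print(color_for("WBC"))      # (255, 0, 0)
print(color_for("Unknown"))  # (0, 255, 0)
```

Inside the drawing loop, the conditional expression could then be replaced by `color = color_for(ann["name"])`.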
Step 4. Displaying Annotated Images
Finally, let’s display our annotated image:
def main(image_file: str, xml_file: str) -> None:
    """Main function for annotating blood cells in an image.

    :param image_file: A path to the image file.
    :param xml_file: A path to the XML annotation file.
    """
    annotations = parse_xml(xml_file)
    image = cv2.imread(image_file)
    if image is None:
        # cv2.imread returns None instead of raising when reading fails.
        raise FileNotFoundError(f"Could not read image: {image_file}")
    # OpenCV loads images as BGR; Matplotlib expects RGB.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_with_boxes = draw_bounding_boxes(image, annotations)
    plt.figure(figsize=(8, 6))
    plt.imshow(image_with_boxes)
    plt.axis("off")
    plt.show()
    # Count how many cells of each type were annotated.
    cell_counts = {}
    for ann in annotations:
        cell_type = ann["name"]
        cell_counts[cell_type] = cell_counts.get(cell_type, 0) + 1
    print(cell_counts)
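This function loads the image, converts it from OpenCV's BGR channel order to RGB, draws the annotations, and uses Matplotlib to display the result. It then prints how many cells of each type were annotated.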
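The counting loop can also be written with `collections.Counter` from the standard library. The sketch below uses a hypothetical annotation list in the shape `parse_xml` returns; the labels and the empty bounding boxes are placeholders for illustration only.

```python
from collections import Counter

# Hypothetical annotations in the shape returned by parse_xml;
# the bounding boxes are left empty because only the names matter here.
annotations = [
    {"name": "RBC", "bndbox": {}},
    {"name": "WBC", "bndbox": {}},
    {"name": "RBC", "bndbox": {}},
]

# Counter tallies each label in a single pass.
cell_counts = Counter(ann["name"] for ann in annotations)
print(dict(cell_counts))  # {'RBC': 2, 'WBC': 1}
```

`Counter` also offers `most_common()`, which is handy if you want the cell types sorted by frequency.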
Full Code
Here’s the complete script for annotating blood cells:
"""Annotate blood cells."""
import argparse
import xml.etree.ElementTree as ET

import cv2
import matplotlib.pyplot as plt
from numpy import ndarray


def parse_xml(xml_file: str) -> list[dict[str, dict[str, int]]]:
    """Parse an XML file to extract annotations for blood cells.

    :param xml_file: The path to the XML annotation file.
    :return: A list of dictionaries containing the name and bounding box
        coordinates for each annotated object.
    """
    tree = ET.parse(xml_file)
    root = tree.getroot()
    annotations = []
    for obj in root.iter("object"):
        annotations.append(
            {
                "name": obj.find("name").text,
                "bndbox": {
                    "xmin": int(obj.find("bndbox/xmin").text),
                    "ymin": int(obj.find("bndbox/ymin").text),
                    "xmax": int(obj.find("bndbox/xmax").text),
                    "ymax": int(obj.find("bndbox/ymax").text),
                },
            }
        )
    return annotations


def draw_bounding_boxes(
    image: ndarray, annotations: list[dict[str, dict[str, int]]]
) -> ndarray:
    """Draw bounding boxes on an image according to provided annotations.

    :param image: The image on which to draw, as a numpy array.
    :param annotations: A list of dictionaries with annotation details.
    :return: The image with bounding boxes.
    """
    for ann in annotations:
        box = ann["bndbox"]
        # Look up each coordinate by key rather than relying on dict order.
        xmin, ymin, xmax, ymax = (
            box[key] for key in ("xmin", "ymin", "xmax", "ymax")
        )
        # Colors are RGB here because main() converts the image before
        # drawing: red for white blood cells, green for everything else.
        color = (255, 0, 0) if ann["name"] == "WBC" else (0, 255, 0)
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), color, 2)
        cv2.putText(
            image,
            ann["name"],
            # Keep the label inside the image when the box touches the top.
            (xmin, max(ymin - 10, 15)),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.5,
            color,
            2,
        )
    return image


def main(image_file: str, xml_file: str) -> None:
    """Main function for annotating blood cells in an image.

    :param image_file: A path to the image file.
    :param xml_file: A path to the XML annotation file.
    """
    annotations = parse_xml(xml_file)
    image = cv2.imread(image_file)
    if image is None:
        # cv2.imread returns None instead of raising when reading fails.
        raise FileNotFoundError(f"Could not read image: {image_file}")
    # OpenCV loads images as BGR; Matplotlib expects RGB.
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_with_boxes = draw_bounding_boxes(image, annotations)
    plt.figure(figsize=(8, 6))
    plt.imshow(image_with_boxes)
    plt.axis("off")
    plt.show()
    # Count how many cells of each type were annotated.
    cell_counts = {}
    for ann in annotations:
        cell_type = ann["name"]
        cell_counts[cell_type] = cell_counts.get(cell_type, 0) + 1
    print(cell_counts)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Annotate blood cells in an image given XML annotations."
    )
    parser.add_argument("image", type=str, help="Path to the image file.")
    parser.add_argument(
        "xml", type=str, help="Path to the XML annotation file."
    )
    args = parser.parse_args()
    main(args.image, args.xml)
Testing the Annotations
Now, let’s test it:
$ python image_analysis/annotate_blood_cells.py blood_cells/BloodImage_00000.jpg blood_cells/BloodImage_00000.xml
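This opens a Matplotlib window showing the image with a labeled bounding box around each annotated cell, and prints the count of each cell type to the terminal.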
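In the command above, the image and its annotation file share a folder and a file name and differ only in extension. If your data follows the same layout, the XML path can be derived from the image path instead of typed out. This is a sketch under that assumption; `xml_path_for` is a hypothetical helper, not part of the script.

```python
from pathlib import Path


def xml_path_for(image_path: str) -> Path:
    """Return the annotation path that shares the image's folder and stem."""
    return Path(image_path).with_suffix(".xml")


print(xml_path_for("blood_cells/BloodImage_00000.jpg"))
```

The result can be passed to `main` as `str(xml_path_for(image_file))` if you prefer a single command-line argument.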