Metehan Sarikaya
AIN311 Fall 2023 Projects
3 min read · Nov 26, 2023

Weed Classification for Robotic Agriculture: Week 2, Exploratory Data Analysis (EDA)

Welcome to Our Project’s Second-Week Blog!

By Ege Mert Gülderen

Welcome back to our series on ‘Weed Classification for Robotic Agriculture.’ In this post, we take a close look at our two datasets and share the main findings from our exploratory data analysis (EDA). These findings will shape how we apply machine learning to make weed management, and agriculture more broadly, more sustainable.

Dataset Spotlight

DeepWeeds Dataset:

At the core of our project lies the ‘DeepWeeds’ dataset, a collection of 17,509 images labeled with nine classes: eight weed species, from Chinee apple to Siam weed, plus a ‘Negative’ class for images that contain none of them. Each class poses its own challenges for weed classification.

Class distribution of DeepWeeds Dataset

The class distribution shows a clear pattern: some classes are heavily represented, while others have far fewer images. Addressing this imbalance will guide how we train our model.
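
A class distribution like this can be computed from nothing more than the list of image labels. The snippet below is a minimal sketch, not the code used in this project: the function name `class_distribution` is hypothetical, and the toy labels are made up for illustration and do not reflect the real DeepWeeds counts.

```python
from collections import Counter


def class_distribution(labels):
    """Return (label, count, share) rows, most frequent class first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, n, n / total) for label, n in counts.most_common()]


# Toy labels for illustration only; NOT the real DeepWeeds counts.
toy_labels = ["Negative"] * 6 + ["Chinee apple"] * 3 + ["Siam weed"] * 1

for label, n, share in class_distribution(toy_labels):
    print(f"{label:<14} {n:>3} {share:6.1%}")
```

Sorting by count makes the dominant and the rare classes visible at a glance, which is the same information the bar chart conveys.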

Sample images

Kaggle Weed Detection Dataset:

Our second dataset focuses on weeds in soybean crops. Its 15,336 images are split into four classes: soil, soybean, grass, and broadleaf weeds.

The two datasets differ in image size. DeepWeeds images all have a fixed size of 255x255, while images in the Kaggle dataset vary in size. To reconcile the two, we resize every image to 255x255 so that our training data is uniform.

Image size for each class

Sample images from each class
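
The resizing step described above can be sketched with the Pillow library. This is a minimal illustration under stated assumptions, not the project's actual preprocessing code: the helper name `standardize` is hypothetical, Pillow is assumed to be installed, and the in-memory image stands in for a file that a real pipeline would load with `Image.open(path)`.

```python
from PIL import Image  # Pillow; assumed to be installed

TARGET_SIZE = (255, 255)  # the standardized size stated in this post


def standardize(image, size=TARGET_SIZE):
    """Return an RGB copy of `image` resized to `size`.

    The aspect ratio is NOT preserved: the image is stretched to fit.
    """
    return image.convert("RGB").resize(size)


# Stand-in for a Kaggle image of arbitrary size (width 400, height 300).
sample = Image.new("RGB", (400, 300), color=(30, 120, 40))
resized = standardize(sample)
print(sample.size, "->", resized.size)
```

A plain resize stretches non-square images. If that distortion turns out to matter, cropping or padding to a square before resizing is a common alternative.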

Key EDA Insights:

Class Imbalances:

The weed classes in our datasets are not evenly represented: some classes have many images, while others have few. A model trained on such data can favor the common classes, so we will account for the imbalance during training to make sure rare weed types are recognized as reliably as common ones.
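
One common way to account for class imbalance is to weight each class inversely to its frequency, so that mistakes on rare classes cost more during training. The sketch below shows that idea only; it is an assumption about one possible approach, not necessarily the strategy this project will adopt. The function name `balanced_class_weights` is hypothetical, the toy labels are not real dataset counts, and the formula `n_samples / (n_classes * count)` is the same heuristic scikit-learn uses for its "balanced" class weights.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


# Toy labels for illustration only; NOT the real dataset counts.
toy_labels = ["Negative"] * 6 + ["Chinee apple"] * 3 + ["Siam weed"] * 1

weights = balanced_class_weights(toy_labels)
for label, weight in weights.items():
    print(f"{label:<14} weight = {weight:.3f}")
```

The rarest class receives the largest weight, and a perfectly balanced dataset would give every class a weight of 1. Weights like these can usually be passed to a weighted loss function in most training frameworks.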

Negative Class:

The DeepWeeds dataset also includes a ‘Negative’ class. Its images contain none of the eight target weed species, only other vegetation and background. Learning this class is essential for effective weed management, because the model must be able to tell when no target weed is present instead of flagging every plant it sees. This fits our broader mission of promoting sustainable agriculture by limiting intervention to the plants that are actually undesirable.

The Road Ahead

As we move from EDA to model implementation, these insights will guide our choices. The next blog post will present our machine learning model architecture, our training strategy, and preliminary results. Stay tuned as we continue working toward better weed management in agriculture.

Metehan Sarikaya
AIN311 Fall 2023 Projects

Hi! I'm a senior AI student at Hacettepe University. I love working with Machine Learning and Data Science, where I play with data and find astonishing patterns.