A Practical Guide to Toilet Seat Recognition Using ML.NET — 1
Automating Bathroom Amenities Detection
Introduction
In this series, we use ML.NET for a specific, practical use case: detecting toilet seats in bathroom images.
This project isn’t just a technical exercise; it addresses a real-world need in various industries. By automating the detection of toilet seats in bathroom images, we can significantly streamline and enhance several operational processes.
Usefulness of Toilet Seat Detection
The automatic identification of toilet seats in images has practical applications in multiple sectors:
- Real Estate and Property Management
- Hospitality Industry
- Public Health and Sanitation Monitoring
- Retail and Inventory Management
Overview
Throughout this series, we will cover everything from setting up your ML.NET environment and preparing your dataset to training and evaluating your model. The focus will be on practical implementation, supplemented with code snippets and explanations, making it accessible for both beginners and those with some experience in .NET development.
By the end of this series, not only will you have a working model capable of detecting toilet seats in images, but you will also gain insights into the practical application of machine learning in real-world scenarios.
Let’s start with a practical guide to creating an image classification model using ML.NET to detect toilet seats in images. We’ll go step-by-step, focusing more on the code and essential explanations.
Setting Up the Project
First, we need to set up a new .NET console application and install the necessary ML.NET packages.
- Choose Console App (.NET Core).
- Name your project ToiletSeatDetection.
- Install the Microsoft.ML NuGet package.
- Install Microsoft.ML.ImageAnalytics and Microsoft.ML.Vision for image-related tasks.
- ML.NET’s image classification relies on TensorFlow under the hood, so it’s essential to have TensorFlow set up correctly. Here that means also referencing the SciSharp.TensorFlow.Redist package.
<ItemGroup>
  <PackageReference Include="Microsoft.ML" Version="3.0.0" />
  <PackageReference Include="Microsoft.ML.ImageAnalytics" Version="3.0.0" />
  <PackageReference Include="Microsoft.ML.Vision" Version="3.0.0" />
  <PackageReference Include="SciSharp.TensorFlow.Redist" Version="2.16.0" />
</ItemGroup>
Preparing the Data
- Create two folders in your project: images_with_toilet and images_without_toilet.
- Populate these folders with relevant images.
Create a Data Model
- In your project, create a new class file, BathroomData.cs.
- Define a data model representing your input data.
- Also create a class for the model’s output, BathroomPrediction.
public class BathroomData
{
    public string ImagePath { get; set; }
    public string Label { get; set; }
}
ImagePath:
- Purpose: This property holds the file path of the image. It’s a string that tells the program where to find the image file it needs to process.
- Usage in ML.NET: In the context of ML.NET, this path is used to load the image into the model for training or prediction.
Label:
- Purpose: The Label property is a string that represents the category or classification of the image. In your case, it would indicate whether the image is of a bathroom with a toilet seat ('with toilet') or without a toilet seat ('without toilet').
- Importance for Training: This property is crucial during the training phase of your machine learning model. ML.NET uses this label to understand what each image represents and learns to associate the visual features of the image with the corresponding category.
- Format: Typically, labels are simple and descriptive strings like ‘with toilet’ and ‘without toilet’. The key is consistency across your dataset.
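using Microsoft.ML.Data; // provides the ColumnName attribute

public class BathroomPrediction
{
    // Filled from the pipeline's "PredictedLabel" output column.
    [ColumnName("PredictedLabel")]
    public string PredictedLabel { get; set; }
}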
Loading the Data
- In your Program.cs, start by creating an MLContext. This is the starting point for all ML.NET operations.
using System.Collections.Generic; // List<T>
using System.IO;                  // Directory
using Microsoft.ML;

var mlContext = new MLContext();
- We need to load the images into a list that the ML.NET model can understand. Assuming you have images in two folders, images_with_toilet and images_without_toilet, you can load them like this:
var images = new List<BathroomData>();

// Load images with toilet
foreach (var imagePath in Directory.GetFiles("images_with_toilet"))
{
    images.Add(new BathroomData { ImagePath = imagePath, Label = "with toilet" });
}

// Load images without toilet
foreach (var imagePath in Directory.GetFiles("images_without_toilet"))
{
    images.Add(new BathroomData { ImagePath = imagePath, Label = "without toilet" });
}
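Directory.GetFiles returns every file in a folder, so stray non-image files (thumbnail caches, notes, and the like) would be labeled and passed to the trainer as if they were images. The following is a minimal sketch of one way to guard against that by filtering on file extension. The ImageFiles class, the GetImageFiles method, and the extension list are all hypothetical names and assumptions for illustration, not part of ML.NET.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ImageFiles
{
    // Extensions treated as images. This list is an assumption; adjust it to your data.
    private static readonly HashSet<string> Extensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".jpg", ".jpeg", ".png" };

    // Returns the files directly inside `folder` whose extension is in the list above,
    // sorted by path so the order is stable between runs.
    public static List<string> GetImageFiles(string folder)
    {
        return Directory.EnumerateFiles(folder)
            .Where(path => Extensions.Contains(Path.GetExtension(path)))
            .OrderBy(path => path, StringComparer.Ordinal)
            .ToList();
    }
}
```

With this helper in the project, each loop could iterate over ImageFiles.GetImageFiles("images_with_toilet") instead of Directory.GetFiles("images_with_toilet").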
Prepare the Data
Split your dataset into training and testing data. ML.NET provides methods for this.
var dataView = mlContext.Data.LoadFromEnumerable(images);
var trainTestData = mlContext.Data.TrainTestSplit(dataView, testFraction: 0.2);
var trainingData = trainTestData.TrainSet;
var testData = trainTestData.TestSet;
Here’s what testFraction: 0.2 means:
- Test Fraction: This value represents the proportion of the dataset to be used for testing.
- 0.2: Specifically, 0.2 (or 20%) indicates that 20% of your entire dataset will be set aside for testing the model’s performance.
- Training vs Testing Data: Consequently, the remaining 80% of the data will be used for training the model.
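To make the arithmetic concrete, here is a small sketch in plain C# (no ML.NET calls) that computes the expected size of each set. SplitMath and ExpectedSizes are hypothetical names used only for illustration. TrainTestSplit assigns rows to the two sets at random, so the actual counts may differ slightly from these figures; passing its optional seed argument should make the split repeatable between runs.

```csharp
using System;

public static class SplitMath
{
    // Expected (train, test) sizes for a dataset of `total` rows
    // when `testFraction` of the rows is set aside for testing.
    public static (int Train, int Test) ExpectedSizes(int total, double testFraction)
    {
        if (testFraction < 0 || testFraction > 1)
            throw new ArgumentOutOfRangeException(nameof(testFraction));

        int test = (int)Math.Round(total * testFraction);
        return (total - test, test);
    }
}
```

For a dataset of 500 images, SplitMath.ExpectedSizes(500, 0.2) returns (400, 100): roughly 400 images for training and 100 for testing.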
Define the ML Pipeline
Build a pipeline for the ML task. This pipeline is a sequence of data transformations followed by a training algorithm for the image classification task. Here, the steps are converting the labels to keys, loading the image bytes, training the classifier, and mapping the predicted keys back to labels.
var pipeline = mlContext.Transforms.Conversion.MapValueToKey(inputColumnName: "Label", outputColumnName: "LabelAsKey")
    .Append(mlContext.Transforms.LoadRawImageBytes(outputColumnName: "Image",
        imageFolder: ".",
        inputColumnName: nameof(BathroomData.ImagePath)))
    .Append(mlContext.MulticlassClassification.Trainers.ImageClassification(
        labelColumnName: "LabelAsKey",
        featureColumnName: "Image"))
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel", "PredictedLabel"));
Let’s look at each step of this pipeline in detail.
MapValueToKey Transformation
mlContext.Transforms.Conversion.MapValueToKey(
inputColumnName: "Label",
outputColumnName: "LabelAsKey")
- Purpose: Converts the Label column (string values) into a numeric key type (LabelAsKey), which is required for machine learning algorithms in ML.NET.
- Why: Many ML algorithms work better with numeric data, so this step transforms your string labels (e.g., ‘with toilet’, ‘without toilet’) into a numeric form.
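The idea behind this step can be illustrated with a plain C# dictionary. This is only a conceptual sketch, not how ML.NET implements the transform, and KeyMappingSketch and BuildKeyMap are hypothetical names. It assumes the default behavior as I understand it: distinct labels receive keys in order of first appearance, starting at 1, with 0 reserved for missing values.

```csharp
using System;
using System.Collections.Generic;

public static class KeyMappingSketch
{
    // Assigns keys 1, 2, 3, ... to distinct labels in order of first appearance.
    // Key 0 is left unused, mirroring the assumption that 0 means "missing".
    public static Dictionary<string, uint> BuildKeyMap(IEnumerable<string> labels)
    {
        var map = new Dictionary<string, uint>();
        foreach (var label in labels)
        {
            if (!map.ContainsKey(label))
                map[label] = (uint)(map.Count + 1);
        }
        return map;
    }
}
```

For the labels "with toilet", "without toilet", "with toilet", this sketch maps "with toilet" to 1 and "without toilet" to 2. The last pipeline step, MapKeyToValue, performs the reverse lookup on the predictions.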
LoadRawImageBytes Transformation
.Append(mlContext.Transforms.LoadRawImageBytes(
outputColumnName: "Image",
imageFolder: ".",
inputColumnName: nameof(BathroomData.ImagePath)))
- Purpose: Loads the raw bytes of each image file into the pipeline. It takes the path of each image (from BathroomData.ImagePath) and loads the file's contents as a byte array.
- imageFolder: Specifies the base folder for the images. Here, "." means the current directory, so the paths in ImagePath are resolved relative to the directory the program runs from.
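Because each ImagePath is resolved against imageFolder, a wrong working directory or a misspelled folder name only shows up once the pipeline tries to read the files. A minimal sketch of a pre-flight check follows. It assumes the loader joins the two values the way Path.Combine does, and PathCheck and FindMissing are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class PathCheck
{
    // Returns the relative paths that do not resolve to an existing file
    // when combined with `imageFolder`.
    public static List<string> FindMissing(string imageFolder, IEnumerable<string> relativePaths)
    {
        return relativePaths
            .Where(path => !File.Exists(Path.Combine(imageFolder, path)))
            .ToList();
    }
}
```

Calling PathCheck.FindMissing(".", images.Select(i => i.ImagePath)) before training and checking that the result is empty gives an early, readable error instead of a failure deep inside the pipeline.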
ImageClassification Trainer
.Append(mlContext.MulticlassClassification.Trainers.ImageClassification(
labelColumnName: "LabelAsKey",
featureColumnName: "Image"))
- Purpose: Adds an image classification trainer to the pipeline.
- Parameters: The column names “Image” and “LabelAsKey” specify the input features (image data) and the labels for training the model.
- Why: This is the core machine learning algorithm that trains the model to classify images based on the input features and their corresponding labels.
MapKeyToValue Transformation
.Append(mlContext.Transforms.Conversion.MapKeyToValue(
"PredictedLabel", "PredictedLabel"))
- Purpose: Converts the predicted label keys back to their original string values.
- Why: After the model makes predictions, this step translates the numeric predictions back into understandable labels (e.g., ‘with toilet’, ‘without toilet’).
Conclusion
This setup prepares your project for the next steps, which include training the model, evaluating its performance, and eventually using it for predictions. Ensure that your environment is set up correctly and that the paths to your image folders are correct. Once you have this foundation in place, you’re ready to proceed with training and using your ML.NET model.
📚 For more insights like these, feel free to 👏 follow 👉 Merwan Chinta