The seeing Raspberry Pi

Henk Boelman
Global AI Community
8 min read · Jan 20, 2020

Welcome to the Seeing Pi! In this article we are going to create a custom vision model (ONNX) and run it on a Raspberry Pi that runs Windows IoT to detect objects and show the result on a display.

We start by building a model with Azure Custom Vision, a service that lets you create a vision model and export it in different formats. We will export the model in the ONNX format.

After we have created the model, we build a basic UWP (Universal Windows Platform) application that grabs frames from a camera and runs them through the model. The result is displayed first in the UWP app; later on we add an SPI display.

Finally, we connect the wires on the Raspberry Pi 3, and deploy our program to it.

Let’s get started!

Part 1 — Create your model

In this first part we are going to create a model using Azure Custom Vision, a service that lets you create a vision model. You can export this model in different formats; we will export the model as ONNX.

You can follow the tutorial below or download a basic ONNX model that can detect apples and bananas.

1.1 Create your dataset

  • Find 2 objects in the room, like your watch, phone, mouse, cup or anything else that can serve as an object to classify.
  • Take at least 15–30 photos of each object. Make sure you use different angles and backgrounds.
  • From the 15–30 pictures per object, set 1 per object aside for testing your model.

1.2 Create a new project

  • Go to the website customvision.ai and log in with a Microsoft account that has access to an Azure subscription (get started for free here).
  • Click on Create a new project.
  • If you don’t have a Resource Group holding a Custom Vision endpoint, you can create a new one here.

1.3 Upload & Tag your pictures

After creating your project you can start uploading your images. The best way is to upload and tag the images one object at a time.

  • Click “Add Images”
  • Click “Browse some local files” (images cannot exceed 4 MB each)
  • Tag the images: type the tag name
  • Click the upload button
  • Repeat these steps for all your objects

Finally, add a “negative class”

  • Download a set of random images here
  • Click add images
  • Select in the tag dropdown “Negative”

1.4 Train your model

Now it is time to train your custom vision model.

  • Click in the top right on the green “Train” button
  • The training will take only a few seconds. When the training is done you will see the precision and recall of the model. These should be above 90%; if not, you have to take some better pictures 😉
  • Click on publish and give the “Publish name” the value “latest”

Have a look at this screen: here you can find the API endpoint and export the model.

1.5 Test your model

Now that you have trained the model, it is time to test it. Click the “Quick Test” button in the top right, upload an image and see the results!

1.6 Export the model

If your model works well on your test set of images, it is time to export the model so it can run offline.

  • Open the tab “Performance”
  • Click on “Export”
  • Download the “ONNX Model”

Part 2 — Build the UWP App

In this part we are going to build a basic UWP (Universal Windows Platform) application that grabs frames from a camera and runs them through the model. We display the results first in the UWP app; later on we add an SPI display.

You can follow the tutorial below or clone the repository containing the code.

2.1 Create the app

In Visual Studio 2019:

  • File > New Project
  • Select: Visual C# > Windows Universal > Blank App (Universal App)
  • Select Build 17763 (or higher) as the target version

2.2 The camera

2.2.1 Enable the Camera

  • Open the “Package.appxmanifest” file
  • Open the tab: “Capabilities”
  • Check the checkboxes “Webcam” and “Microphone”

2.2.2 Showing the camera feed

  • Open the file: “MainPage.xaml”
  • Add the code below between the “grid” tags:
<StackPanel>
    <TextBlock x:Name="StatusText" FontWeight="Bold" TextWrapping="Wrap" Text="...."/>
    <CaptureElement Name="PreviewControl" Stretch="Uniform"/>
</StackPanel>
  • Open the file: “MainPage.xaml.cs”
  • Add this code to the class: “MainPage”
private readonly DisplayRequest _displayRequest = new DisplayRequest();
private readonly MediaCapture _mediaCapture = new MediaCapture();

private async Task StartVideoPreviewAsync()
{
    // Initialize the camera, keep the screen active and show the preview feed.
    await _mediaCapture.InitializeAsync();
    _displayRequest.RequestActive();
    PreviewControl.Source = _mediaCapture;
    await _mediaCapture.StartPreviewAsync();
}
  • Call the StartVideoPreviewAsync method from the constructor, as in the sketch below
  • Run the application and validate you can see the camera feed
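For reference, a minimal sketch of that constructor call, assuming the default MainPage generated by the Blank App template (the discard _ = just suppresses the compiler warning about the unawaited task):

public MainPage()
{
    this.InitializeComponent();

    // Fire and forget: start the camera preview without blocking the UI thread.
    _ = StartVideoPreviewAsync();
}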

2.3 ONNX Model

2.3.1 Import the model

  • Rename the .onnx file you downloaded in the previous step to “mycustomvision.onnx”
  • Copy the mycustomvision.onnx file to the Assets folder
  • Go to Solution Explorer in Visual Studio
  • Right click on the Assets Folder > Add > Existing Item > Select the “mycustomvision.onnx” file and click add.
  • In the properties of “mycustomvision.onnx”, set:
    • Build Action: Content
    • Copy to Output Directory: Copy if newer
  • Verify that you have a new file in the root of your project called: “mycustomvision.cs”
  • Replace the content of this file with the following:
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Windows.Media;
using Windows.Storage;
using Windows.AI.MachineLearning;

namespace TheSeeingPi
{
    public sealed class MyCustomVisionModelInput
    {
        public VideoFrame Data;
    }

    public sealed class MyCustomVisionModelOutput
    {
        public TensorString ClassLabel = TensorString.Create(new long[] { 1, 1 });
        public IList<IDictionary<string, float>> Loss = new List<IDictionary<string, float>>();
    }

    public sealed class MyCustomVisionModel
    {
        private LearningModel model;
        private LearningModelSession session;
        private LearningModelBinding binding;

        public static async Task<MyCustomVisionModel> CreateFromStreamAsync(StorageFile stream)
        {
            // Load the ONNX model once and keep a session and binding around for reuse.
            MyCustomVisionModel learningModel = new MyCustomVisionModel();
            learningModel.model = await LearningModel.LoadFromStorageFileAsync(stream);
            learningModel.session = new LearningModelSession(learningModel.model);
            learningModel.binding = new LearningModelBinding(learningModel.session);
            return learningModel;
        }

        public async Task<MyCustomVisionModelOutput> EvaluateAsync(MyCustomVisionModelInput input)
        {
            // Bind the frame to the model input "data" and run the evaluation.
            binding.Bind("data", input.Data);
            var result = await session.EvaluateAsync(binding, "0");

            var output = new MyCustomVisionModelOutput();
            output.ClassLabel = result.Outputs["classLabel"] as TensorString;
            output.Loss = result.Outputs["loss"] as IList<IDictionary<string, float>>;
            return output;
        }
    }
}
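Note that the LearningModelSession and LearningModelBinding are created once in CreateFromStreamAsync and then reused for every call to EvaluateAsync, so each frame only pays for binding and evaluation, not for reloading the ONNX file.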

2.3.2 Load the model

  • Open the file: “MainPage.xaml.cs”
  • Add this code to the class: “MainPage”
private string _modelFileName = "mycustomvision.onnx";
private MyCustomVisionModel _model = null;

private async Task LoadModelAsync()
{
    await Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => StatusText.Text = $"Loading {_modelFileName}");
    var modelFile = await StorageFile.GetFileFromApplicationUriAsync(new Uri($"ms-appx:///Assets/{_modelFileName}"));
    _model = await MyCustomVisionModel.CreateFromStreamAsync(modelFile);
    await Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => StatusText.Text = $"Loaded {_modelFileName}");
}
  • Call the LoadModelAsync method from the constructor, as in the updated sketch below
  • Run the application and validate that the model is loaded
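Assuming the same fire-and-forget pattern as before, the constructor now kicks off both tasks:

public MainPage()
{
    this.InitializeComponent();

    // Fire and forget: load the model and start the camera preview.
    _ = LoadModelAsync();
    _ = StartVideoPreviewAsync();
}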

2.4 Analyze the camera feed

2.4.1 Grabbing the frames from the camera

  • Open the file: “MainPage.xaml.cs”
  • Add this code to the class: “MainPage”
private readonly SemaphoreSlim _frameProcessingSemaphore = new SemaphoreSlim(1);
private ThreadPoolTimer _frameProcessingTimer;
public VideoEncodingProperties VideoProperties;
  • Add these lines to the “StartVideoPreviewAsync” method:
TimeSpan timerInterval = TimeSpan.FromMilliseconds(66); //15fps
_frameProcessingTimer = ThreadPoolTimer.CreatePeriodicTimer(new TimerElapsedHandler(ProcessCurrentVideoFrame), timerInterval);
VideoProperties = _mediaCapture.VideoDeviceController.GetMediaStreamProperties(MediaStreamType.VideoPreview) as VideoEncodingProperties;
  • Add this method:
// Requires: using System.Diagnostics, System.Threading, Windows.Graphics.Imaging and Windows.System.Threading.
private async void ProcessCurrentVideoFrame(ThreadPoolTimer timer)
{
    // Skip this tick if the camera is not streaming yet, or if the previous frame is still being processed.
    if (_mediaCapture.CameraStreamState != Windows.Media.Devices.CameraStreamState.Streaming || !_frameProcessingSemaphore.Wait(0))
    {
        return;
    }

    try
    {
        using (VideoFrame previewFrame = new VideoFrame(BitmapPixelFormat.Bgra8, (int)VideoProperties.Width, (int)VideoProperties.Height))
        {
            await _mediaCapture.GetPreviewFrameAsync(previewFrame);

            // Evaluate the image
            await Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => StatusText.Text = $"Analyzing frame {DateTime.Now.ToLongTimeString()}");
        }
    }
    catch (Exception ex)
    {
        Debug.WriteLine("Exception with ProcessCurrentVideoFrame: " + ex);
    }
    finally
    {
        _frameProcessingSemaphore.Release();
    }
}
  • Run the application and validate that frames are being analyzed: the status text updates with a new timestamp for each processed frame
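The SemaphoreSlim guard is what keeps this loop healthy on slow hardware: Wait(0) returns immediately instead of blocking, so when the previous frame is still being processed the timer tick is simply skipped rather than queued up. On the Raspberry Pi, where a single evaluation will typically take longer than 66 ms, the app therefore scores frames as fast as the device allows.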

2.4.2 Scoring the frames

  • Open the file: “MainPage.xaml.cs”
  • Add this code to the class: “MainPage”
private async Task EvaluateVideoFrameAsync(VideoFrame frame)
{
    if (frame != null)
    {
        try
        {
            // Wrap the frame in the model input type and run the model.
            MyCustomVisionModelInput inputData = new MyCustomVisionModelInput
            {
                Data = frame
            };
            var output = await _model.EvaluateAsync(inputData);

            // Take the winning label and look up its confidence score.
            var product = output.ClassLabel.GetAsVectorView()[0];
            var loss = output.Loss[0][product];
            var message = product + " " + (loss * 100.0f).ToString("#0.00") + "%";

            await Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => StatusText.Text = message);
            Debug.WriteLine(message);

            // Insert Lines for SPI Display here
        }
        catch (Exception ex)
        {
            Debug.WriteLine($"error: {ex.Message}");
        }
    }
}
  • In the “ProcessCurrentVideoFrame” method replace:
// Evaluate the image
await Dispatcher.RunAsync(CoreDispatcherPriority.Normal, () => StatusText.Text = $"Analyzing frame {DateTime.Now.ToLongTimeString()}");
  • With
await Task.Run(async () =>
{
await EvaluateVideoFrameAsync(previewFrame);
});
  • Run the application and validate that you see the classification of every frame. Hold the objects in front of the camera and see if it is working.
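With the sample apples-and-bananas model, the status text and debug output will look something like apple 98.12% (the winning tag followed by its confidence); the exact numbers will vary per frame.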

Your application should now show the camera feed with the live classification in the status text.

Part 3 — Run it on the Raspberry Pi 3

To run this application on a Raspberry Pi you need the following hardware: a Raspberry Pi 3, a camera (for example a USB webcam), an SPI display and jumper wires to connect it.

Follow this guide to set up Windows 10 IoT Core on your Raspberry Pi.

3.1 Display the result on the SPI display

  • Add the SPIDisplay module to your project. View module
  • Open the file: “MainPage.xaml.cs”
  • Add this code to the class: “MainPage”
private readonly SPIDisplay _spiDisplay = new SPIDisplay();
  • Add the following line to the constructor:
_spiDisplay.InitAll();
  • Replace “// Insert Lines for SPI Display here” in the “EvaluateVideoFrameAsync” method with:
_spiDisplay.WriteLinesToScreen(new List<string>{message});
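For reference, a hypothetical outline of the SPIDisplay class surface that these steps assume; the actual SPI wiring and drawing code lives in the linked module:

using System.Collections.Generic;

public sealed class SPIDisplay
{
    // Opens the SPI device and initializes the display controller.
    public void InitAll() { /* see the linked module */ }

    // Renders the given lines of text on the display.
    public void WriteLinesToScreen(List<string> lines) { /* see the linked module */ }
}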

3.2 Connect the display to the Raspberry Pi

Connect all the wires exactly the same as in the schema below.

Don’t forget to remove the power before you start wiring.

3.3 Run it on the device

  • For the debug profile, select ARM
  • Select “Device”
  • Type the IP address of your Raspberry Pi
  • For the protocol, select “Windows Universal”
  • Click Select
  • Click the green play button to debug your solution on the Pi

The first time it can take a while to deploy, so this is a good time for some coffee!

Now we are ready to go! Put your object in front of the camera and watch the outcome of your model appear on the display.
