Computing on the EDGE

Nipun Agarwal
Published in Analytics Vidhya
7 min read · Dec 15, 2020

Most companies today are moving to the cloud for their computation and storage needs. The cloud provides a one-stop solution for services across many areas, be it large-scale processing, ML model training and deployment, or big data storage and analysis. This, however, requires moving data, video or audio to the cloud for processing and storage, which has certain shortcomings compared to doing it at the client:

  • Network latency
  • Network cost and bandwidth
  • Privacy
  • Single point failure

If you look at the other side, the cloud has its own advantages, which I will not talk about right now. With all this in mind, how about a hybrid approach where a few requirements are moved to the client and the rest remain on the cloud? This is where EDGE computing comes into the picture. According to Wikipedia, here is the definition:

"Edge computing is a distributed computing paradigm that brings computation and data storage closer to the location where it is needed, to improve response times and save bandwidth."

Edge has a lot of use cases, like:

  • Trained ML models (especially for video and audio) sitting closer to the edge for inferencing or prediction.
  • IoT data analysis for large-scale machines right at the edge.

Look at the Gartner hype cycle for emerging technologies; edge is gaining momentum.

Hype cycle 2019

There are many platforms in the market specialised in edge deployments, ranging from cloud solutions like Azure IoT Hub and AWS Greengrass, to open-source options like KubeEdge and EdgeX Foundry, and third-party offerings like IntelliSite.

I will focus this article on using one of these platforms to build an “Attendance platform” on the edge using facial recognition. I will add as many links as possible for your reference.

Let us start by taking the first step and defining the requirements:

  • Capture video from the camera
  • Recognise faces based on a trained ML model
  • Display the video feed with recognised faces on the monitor
  • Log attendance in a database
  • Collect logs and metrics
  • Save unrecognised images to a central repository for retraining and improving the model
  • Multi-site deployments

Choosing a platform

Choosing the right platform from so many options was a bit tricky. For the POC, we looked at a few aspects of each platform:

  • Pricing
  • Infrastructure maintenance
  • Learning curve
  • Ease of use

There were other metrics as well, but these were at the top of our mind. Azure IoT looked pretty good in terms of the above evaluation. We also looked at KubeEdge, which provides deployments on Kubernetes at the edge; it is open source and looked promising, but given the many components (cloud and edge) involved and the maintenance overhead, we decided not to move ahead with the open-source route. We were already using Azure for our other cloud infrastructure, which made choosing this platform a little easier. This comparison also helped:

Leading platform players

Designing the solution

Azure IoT Hub provides two main components. One is the cloud component, responsible for managing deployments on the edge and collecting data from them. The other is the edge component, consisting of:

  • Edge Agent: manages the deployment and monitoring of modules on the IoT Edge device
  • Edge Hub: handles communication between modules on the IoT Edge device, and between the device and the IoT Hub

I will not go into the details; you can find more about Azure IoT Edge here. To give a brief overview, Azure IoT Edge requires modules packaged as containers, which are pushed to the edge. The edge device first needs to be registered with the IoT Hub. Once the Edge agent connects with the hub, you can push your modules using a deployment.json file. The container runtime that Azure IoT Edge uses is Moby.

We used the Azure IoT free tier, which was sufficient for our POC. Check the pricing here.

As per the requirements of the POC, this is what we came up with:

Attendance platform on the edge

The solution consists of various containers which are deployed on the edge, as well as a few cloud deployments. I will talk about each component in detail as we move ahead.

As part of the POC, we assumed two sites where attendance needs to be taken at multiple gates. To simulate this, we created four Ubuntu machines. This is the Ubuntu desktop image we used. For attendance, we created videos containing still photos of a few film stars and sportspersons. These videos, one for each gate, were used for attendance in order to simulate the cameras.

Modules in action

Camera module

It captures the IP camera feed and pushes the frames for consumption:

  • It uses Python OpenCV for capture. For the POC, we read video files pushed inside the container.
  • Frames are published to ZeroMQ (a brokerless message queue).
  • Used the python3-opencv Docker container as the base image and the pyzmq module for the message queue. Check this blog on how to use ZeroMQ with Python.

The module was configured through a number of environment variables, one being the sampling rate of the video frames. Processing all frames requires high memory and CPU, so it is always advisable to drop frames to reduce CPU load. This can be done in either the camera module or the inferencing module. A minimal sketch of the module is shown below.
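To make the flow concrete, here is a rough sketch of what such a camera module can look like. It only illustrates the capture-sample-publish loop; the SAMPLE_RATE and VIDEO_PATH environment variables, the port and the JPEG encoding are my assumptions, not the exact configuration used in the POC.

import os
import cv2
import zmq

# Hypothetical configuration, driven by environment variables
SAMPLE_RATE = int(os.getenv("SAMPLE_RATE", "5"))    # keep every Nth frame
VIDEO_PATH = os.getenv("VIDEO_PATH", "/videos/gate1.mp4")

context = zmq.Context()
socket = context.socket(zmq.PUB)                    # brokerless pub/sub
socket.bind("tcp://*:5555")

cap = cv2.VideoCapture(VIDEO_PATH)
frame_no = 0
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    frame_no += 1
    if frame_no % SAMPLE_RATE:                      # drop frames to reduce CPU load
        continue
    _, buf = cv2.imencode(".jpg", frame)            # serialise the frame as JPEG
    socket.send(buf.tobytes())
cap.release()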

Inference Module

  • Used a pre-existing face recognition deep learning model for our inferencing needs.
  • Trained the model with easily available images of film stars and sportspersons.
  • The model was deliberately not trained with a couple of images present in the video, to showcase the undetected-image use case. These undetected images were stored in ADLS Gen2, explained in the storage module.
  • The Python pyzmq module was used to consume frames published by the camera module.
  • Not every frame was processed; a few frames were dropped based on the configuration set via environment variables.
  • Once a face was recognised, a JSON attendance message was sent to the cloud using the IoT Edge hub. Use this to specify routes in your deployment file. A rough sketch of the module follows this list.
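Here is a rough sketch of such an inference loop, assuming the open-source face_recognition library and the azure-iot-device SDK. The camera address, the "attendance" output route, the gate name and the load_known_faces() helper are illustrative placeholders, not the code used in the POC.

import json
import cv2
import numpy as np
import zmq
import face_recognition
from azure.iot.device import IoTHubModuleClient, Message

# connect to the IoT Edge hub from inside the module
module_client = IoTHubModuleClient.create_from_edge_environment()
module_client.connect()

# subscribe to the frames published by the camera module (address assumed)
context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://camera:5555")
socket.setsockopt_string(zmq.SUBSCRIBE, "")

# hypothetical helper returning the trained encodings and matching names
known_encodings, known_names = load_known_faces()

while True:
    buf = socket.recv()
    frame = cv2.imdecode(np.frombuffer(buf, dtype=np.uint8), cv2.IMREAD_COLOR)
    for encoding in face_recognition.face_encodings(frame):
        matches = face_recognition.compare_faces(known_encodings, encoding)
        if any(matches):
            name = known_names[matches.index(True)]
            payload = {"name": name, "gate": "gate1"}        # gate name is a placeholder
            module_client.send_message_to_output(Message(json.dumps(payload)), "attendance")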

Display Module

The processed frames from the inference module are passed to the display module to be shown on screen. There were a few challenges in order to access the display port from a container. Again, OpenCV was used for the display. Below is the command used when running a container so that it can access the display port on a Linux machine.

docker run --privileged -e DISPLAY=${DISPLAY} -v /tmp/.X11-unix:/tmp/.X11-unix <image>

I will show later how this is passed via the Edge agent in the deployment file. A minimal sketch of the display loop itself follows.
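Here is a rough sketch of the display loop, assuming the inference module republishes its annotated frames over ZeroMQ; the address, port and window name are assumptions.

import cv2
import numpy as np
import zmq

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.connect("tcp://inference:5556")              # assumed inference module address
socket.setsockopt_string(zmq.SUBSCRIBE, "")

while True:
    frame = cv2.imdecode(np.frombuffer(socket.recv(), dtype=np.uint8), cv2.IMREAD_COLOR)
    cv2.imshow("attendance", frame)                 # needs the X11 socket mounted as shown above
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cv2.destroyAllWindows()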

Storage Module

Unrecognised images from the inference module were stored in this container. The module acts as local storage and uploads these images to the cloud (ADLS Gen2 storage) on the schedule defined in its configuration. We used the pre-built Azure Blob Storage on IoT Edge module (azure-blob-storage:1.3-linux-amd64) for our needs. You can follow this and this for more details. A sketch of how a module can upload to it is shown below.
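Here is a rough sketch of pushing an unrecognised frame into the local blob storage module using the standard azure-storage-blob SDK. The connection string, endpoint, port and container name are assumptions based on how the module is typically configured, not the exact values from the POC.

from datetime import datetime
from azure.storage.blob import BlobServiceClient

# assumed connection string for the local blob storage module
LOCAL_BLOB_CONN_STR = (
    "DefaultEndpointsProtocol=http;"
    "AccountName=localstore;AccountKey=<local account key>;"
    "BlobEndpoint=http://storage:11002/localstore;"
)

service = BlobServiceClient.from_connection_string(LOCAL_BLOB_CONN_STR)
container = service.get_container_client("unrecognised")

def save_unrecognised(jpeg_bytes: bytes) -> None:
    # one blob per frame; the module syncs them to ADLS Gen2 on its own schedule
    name = datetime.utcnow().strftime("%Y%m%d-%H%M%S-%f") + ".jpg"
    container.upload_blob(name, jpeg_bytes)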

Logging Module

The logging module required us to deploy cloud components as well. Since we already had a deployment of Grafana and Loki, we decided to use them as our logging infrastructure.

  • Docker provides logging drivers to send container logs to a backend. Grafana Loki normally uses the Promtail agent, which Docker still doesn't officially support as a logging driver, so we used Fluentd as the logging driver instead. Check here for details.
  • The Fluentd container is configured to send all the container logs to Grafana Loki. The necessary tags are sent with each container's logs to differentiate between deployments and containers (see the snippet after this list).
  • Grafana and Loki were installed using Helm charts on AKS (Azure Kubernetes Service).
  • A new storage class (Azure Files) was created in AKS for persistent storage, to be used by Loki for log storage.
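For reference, the Fluentd logging driver can be set per module through the createOptions in the deployment file, along the lines of the snippet below; the Fluentd address and the tag format are assumptions, not the exact values we used.

"createOptions": {
  "HostConfig": {
    "LogConfig": {
      "Type": "fluentd",
      "Config": {
        "fluentd-address": "fluentd:24224",
        "tag": "site1.gate1.inference"
      }
    }
  }
}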

Attendance in SQL

  • Once the inference module recognises a face in a frame, it sends a message to the IoT Hub on the cloud.
  • This JSON message is stored in an Event Hub, which can be configured with a retention period.
  • A Python consumer running in the Kubernetes cluster parses the message and stores it in the Azure Database for MySQL service.

I will not go into the details of the consumer. It is a Python Event Hubs consumer reading data from the event hub and writing to MySQL; a minimal sketch is shown below.
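Here is a rough sketch of such a consumer, assuming the azure-eventhub and mysql-connector-python packages; the connection strings, event hub name, table and column names are placeholders rather than the actual ones used in the POC.

import json
import mysql.connector
from azure.eventhub import EventHubConsumerClient

# placeholder connection details
EVENTHUB_CONN_STR = "<event hub connection string>"
EVENTHUB_NAME = "attendance"

db = mysql.connector.connect(host="<mysql host>", user="<user>",
                             password="<password>", database="attendance")

def on_event(partition_context, event):
    # each event carries the JSON attendance message sent by the inference module
    record = json.loads(event.body_as_str())
    cursor = db.cursor()
    cursor.execute(
        "INSERT INTO attendance (name, gate, seen_at) VALUES (%s, %s, NOW())",
        (record["name"], record["gate"]),
    )
    db.commit()

client = EventHubConsumerClient.from_connection_string(
    EVENTHUB_CONN_STR, consumer_group="$Default", eventhub_name=EVENTHUB_NAME)
with client:
    client.receive(on_event=on_event, starting_position="-1")   # read from the beginning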

Deployment

It is very easy to manage and deploy all these modules using the deployment file, where you can define the modules for the edge, environment variables, Docker run configuration, etc. Azure provides a templated deployment file. Here is a snippet of the display module from the deployment file:

"display": {
"version": "1.0",
"type": "docker",
"status": "running",
"restartPolicy": "always",
"settings": {
"image": "${MODULES.display}",
"createOptions": {
"Env": [
"DISPLAY=:0"
],
"HostConfig": {
"Binds": [
"/tmp/.X11-unix:/tmp/.X11-unix"
],
"Privileged": true
}
}}}

Again, the portal provides a lot of options to configure and manage the deployments through the UI. The deployments were also very fast: once the edge agent was installed, the machine was up and running within minutes of a deployment.

We did not experiment with this on a Windows machine. LCOW (Linux Containers on Windows) support has recently been added to the Windows OS. You can read more about this here.

I think there is huge potential for the edge in the future, and especially with deep learning models becoming lighter and smarter, it becomes easy to push them to the edge with low processing needs.
