Taming the Edges — Part 1

Alexander Koerner
AI+ Enterprise Engineering
Sep 10, 2020
Beautiful natural “Edges” can be found in Saxon Switzerland (Germany)

How to address the operational aspects of the ‘computing’ element in edge computing by utilizing an edge application manager.

This article is not about taming stray strands of hair on the edges of your haircut! It's about how to efficiently handle the deployment and execution of services and applications ('apps') on the edge.

Before we dig deeper, let’s briefly discuss why edge computing is becoming more and more a focus topic as part of an overall journey to cloud strategy for many enterprise customers.

Adopting a hybrid cloud strategy by combining an easily accessible and scalable public cloud with some private cloud capabilities provides enterprises with a very flexible and typically very cost-effective IT landscape. Such a hybrid cloud setup normally needs a reliable network connection with good throughput and low latency to allow 'tear free' access to the cloud-hosted apps.

But how about use cases in rural areas which cannot rely on a reliable and fast network connection such as cable or 5G?

I am not even talking about network infrastructures in developing countries. Just looking at many rural areas in Germany (my home country), you sometimes must be lucky to get even a slow 3G network connection if no cable-based network access is available.

Let’s imagine you are hiking with your latest generation smart phone through the Colorado mountains. You have already prepared yourself for this trip by downloading a precise offline map to guide you through the woods. Your phone is security protected by a built-in face recognition function.

Luckily your phone is an edge device with offline AI capabilities, which allows you to unlock your phone with your smiling face and check whether you are still on the right track w/o the need to be connected to a network.

Since you are hiking with your kids, you wisely decided to also download an app which allows you and your family to identify unknown and potentially poisonous plants (e.g. berries) and wild mushrooms, utilizing your phone's AI co-processor in combination with the phone's camera and a locally stored ML model trained for offline visual plant identification. If that setup relied solely on an active network connection into the cloud, the plant identification would likely not work in such a rural area, and in turn your and your family's lives could be at risk (if you mistakenly ate a poisonous plant or mushroom).

You may have noticed that the word 'offline' has been mentioned three times in the last three paragraphs.

  • So, one of the key edge computing advantages is to support use cases which can’t rely on the continuous availability of a network connection into a cloud environment and which might require at least partial offline operations.

Offline capabilities are just one of the edge computing advantages. Here are some additional strengths:

  • Low latency data processing, which results in an optimized user experience due to better response times. Even with a fast network connection like 5G, one can't trick physics: the round-trip time for a request into a cloud is determined by the physical network distance (signal-wise), the request size (volume), the speed of light, and the processing time in the cloud, plus the return trip.
  • Data privacy and security. Keeping sensitive data on an edge node can help ease regulatory requirements by keeping data within certain geographic boundaries or even within a specific building. Note, however, that an edge node could be small enough to be physically carried away (if not secured well enough); therefore, data at rest on an edge node should ideally be encrypted.
  • Volatility and volume of data can be another strong argument for doing data processing on the edge. Quite often data is only relevant for analytics within the first few milliseconds of its existence. Also: why should one send huge volumes of raw data into a cloud for processing (e.g. machine learning inference of video data) if the same processing could be done close to the data source? Let's talk about network bandwidth reduction and (again) response time optimization in that context.
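To make the physics argument concrete, here is a small back-of-the-envelope sketch. This is my own illustration; the 1,500 km path length and the fiber propagation factor are assumed numbers, not taken from any vendor documentation:

```python
# Back-of-the-envelope lower bound for a cloud round trip.
# Assumptions (hypothetical, for illustration only): signals travel
# through fiber at roughly 2/3 the vacuum speed of light, and the
# one-way signal path to the cloud region is 1,500 km.

SPEED_OF_LIGHT_KM_S = 299_792   # vacuum speed of light, km/s
FIBER_FACTOR = 2 / 3            # typical propagation factor in optical fiber

def min_round_trip_ms(path_km: float, cloud_processing_ms: float = 0.0) -> float:
    """Physics-imposed floor on round-trip time: propagation out and
    back, plus whatever time the cloud needs to process the request."""
    one_way_s = path_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000 + cloud_processing_ms

# 1,500 km each way adds ~15 ms before any processing even starts;
# an edge node a few meters away makes this term effectively zero.
print(round(min_round_trip_ms(1500), 1))  # prints 15.0
```

No amount of bandwidth removes that floor, which is why latency-sensitive workloads benefit from moving the processing next to the data source.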

Based on the edge computing strengths mentioned above and just by looking at the current technology trends and market developments, several industry analysts have predicted that large chunks of enterprise data (75%+) will be generated and processed on the edge by 2025 (+/-).

To be clear: that doesn’t mean that edge computing will be the end of the cloud as we know it, it’s really an extension of a hybrid cloud model down to the computing edges. Today’s central clouds will likely see a slight shift to become more supporting clouds for edge computing satellites.

So, every enterprise which is either already “cloudifying” its applications or which is considering starting its journey to cloud, should seriously investigate a more complete hybrid cloud strategy which not only focuses on cloud native apps, but also on edge native apps based on use cases.

Ideally soon hybrid cloud apps will be able to move transparently across cloud, private cloud, edge cloud and edge device boundaries.

What are typical edge nodes? If you ask n edge computing experts, you will likely get n different interpretations of what defines an edge node.

In my ‘world’ and within the context of that blog article, I would like to use a simplified view of two edge node classes:

  • Edge devices and edge gateways, which have a limited amount of computing resources (whatever limited means these days), which might be portable and should support some means of network communication. Ideally such a device should run a common edge OS like Linux or Windows and even better should be able to run containerized applications/services. Your smart phone also falls into that category.
  • Edge clusters. This edge node class is very often associated with the terms fog computing and edge clouds. Edge clusters are more powerful (multiple CPU cores, larger amounts of RAM and storage) and hence are able to run e.g. Kubernetes as their operating platform. Edge clusters can also be used for heavier-lifting analytics.

A while ago, I had a very 'enlightening' discussion with a leading manufacturer of light components about whether e.g. a light bulb can already be considered an edge device or not.

The group involved in that discussion eventually agreed that a simple light bulb, as most of us know it, probably doesn't qualify as an edge device, but that a light bulb with e.g. an integrated and programmable motion sensor would (although it would likely still fall below my own entry criteria for an edge device above).

Now that we have covered edge node types and the rationale why edge computing is so appealing for some use cases, let's look at how to handle the computing aspect of edge computing.

I would guess that an edge node w/o any application running would be pretty much useless.

In the following sections and in the next parts of my article series I will cover one option for managing applications on edge devices and edge clusters in more detail: a commercial implementation of the Linux Foundation Edge community's Open Horizon platform called IBM Edge Application Manager (IEAM).

IEAM, and hence also Open Horizon, is built on the core principles of providing an open, autonomous, scalable and secure platform to create, deploy, run, secure, monitor, maintain and scale business logic applications and AI analytics across edge devices and edge clusters.

High-level overview of the IBM Edge Application Manager components

The IEAM management hub runs on Red Hat's OpenShift platform and is hence available on any cloud (public or private) that supports Red Hat OpenShift.

The IEAM agent is a very low footprint service which can either be natively installed on an edge node or run containerized on edge nodes whose OS doesn't support or allow native agent installations. On an edge device the agent can be accessed through a provided command line interface (CLI); on edge clusters it is accessed through the Kubernetes admin interfaces.

Both the management hub and the agent service provide open, secure and well documented REST APIs which allow easy integration into existing edge node and edge management frameworks.

The IEAM management hub also provides a Web based UI.

I already mentioned that edge applications which are managed by IEAM need to be containerized. For edge devices IEAM is currently supporting Docker containers, while for edge clusters applications are deployed via Kubernetes operators.

Let’s have a quick look at the main components of the IEAM management hub:

  • The Exchange Server manages the edge node registration, handles the edge service publications (aka the edge apps' or edge microservices' metadata) and keeps its system state in a PostgreSQL database.
  • The Agreement Bot (AgBot) negotiates the ‘agreements’ between each IEAM agent and the hub. During that process the AgBot checks and validates if an edge node is eligible to execute certain services.
  • The Model Manager provides an elegant way of updating and synchronizing machine learning (ML) models on an edge node w/o the need to completely redeploy a service on that node. That feature comes in quite handy for AI-on-the-edge use cases. The Model Manager maintains its models in a MongoDB database.
  • Finally, the Switchboard provides secure P2P communication services between each IEAM agent and the management hub.

And now let’s briefly cover the IEAM agent:

In addition to handling all of the necessary steps to register an edge node with the management hub and to negotiate an agreement between the node and the hub so that a service can be executed on the node, the IEAM agent also monitors the execution of services on the edge node. So in case a containerized service fails for any reason, the agent will try to restart that service automatically.
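The agent's restart behavior can be sketched as a simple supervision loop. This is my own minimal illustration of the watchdog idea, not the actual IEAM agent implementation; `supervise` and `flaky_service` are hypothetical names:

```python
# Minimal sketch of a service watchdog (my own illustration, not the
# IEAM agent's actual code): run a service, and restart it automatically
# if it exits with a failure, up to a bounded number of retries.
import time
from typing import Callable

def supervise(run_service: Callable[[], bool],
              max_restarts: int = 3,
              backoff_s: float = 0.0) -> int:
    """Invoke run_service until it reports success (True) or the
    restart budget is exhausted. Returns the number of restarts used."""
    restarts = 0
    while not run_service():
        if restarts >= max_restarts:
            raise RuntimeError("service keeps failing; giving up")
        restarts += 1
        time.sleep(backoff_s)  # give the node a moment before retrying
    return restarts

# Simulate a flaky containerized service that fails twice, then recovers.
attempts = {"n": 0}
def flaky_service() -> bool:
    attempts["n"] += 1
    return attempts["n"] > 2  # fails on the first two attempts

print(supervise(flaky_service))  # prints 2 (restarted twice before success)
```

The real agent of course supervises containers rather than Python callables, but the retry-with-budget pattern is the essence of it.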

In that case the agent can behave completely autonomously, even while operating offline.

A few words about container registries. IEAM manages applications on edge nodes, but it doesn't maintain or store the related containers. One can basically use any kind of container registry, such as the public Docker Hub registry, the IBM container registry, a private Red Hat OpenShift registry or any other registry.

Ok, enough theory, now let’s have a look at a real-world example.

The scenario: In my edge use case I would like to do some object classification based on a camera stream of single frame pictures. Basically, I will be doing AI on the edge.

Since I didn’t want to spend too much time developing such a use case from scratch, I came across the following cool open source object classification demo based on the very popular Yolo V3 framework developed by Glen Darling: https://github.com/MegaMosquito/achatina

That demo has been developed with IEAM and also LF Edge’s Open Horizon framework in mind.

In my setup I am using an Nvidia Jetson Nano DevKit with the IEAM/Open Horizon agent natively installed on the Nano's Ubuntu Linux as my edge device. Attached to my Jetson Nano is a USB cam which I am using as the image stream source for the "achatina" demo.

My Nvidia Jetson Nano DevKit with an attached fan to keep the GPU and CPU cool

My IEAM management hub has been installed on a Red Hat OpenShift 4.3 cluster (ROKS) on the IBM public Cloud in Frankfurt, Germany.

The "achatina" demo consists of five containerized services:

  1. mqtt  provides a local MQTT broker on my edge node
  2. restcam  grabs single images from either the USB cam or a URL
  3. monitor  provides a web interface to the image classification framework
  4. yolocuda  is the main app which triggers the image taking and the classification
  5. restyolocuda  does the ‘heavy lifting’ for the image classification

The Docker container images for the services above are stored in the public Docker Hub registry.

If you are a more visual person and don't want to read through the demo description below, or if you would like to get a firsthand visual impression of the demo (in addition to my written description), please feel free to watch my short recording of the demo here:

Taming The Edges Part 1 video

In a first step I registered the basic services with the IEAM Exchange server in a dedicated namespace/tenant ("ceh-edge-41"). With the IEAM agent's "hzn" CLI I can list all existing services:

hzn exchange service list
[
  "ceh-edge-41/yolocuda-meta_1.0.0_arm64",
  "ceh-edge-41/monitor_1.0.0_arm64",
  "ceh-edge-41/restyolocuda_1.0.0_arm64",
  "ceh-edge-41/restcam_1.0.0_arm64",
  "ceh-edge-41/mqtt_1.0.0_arm64"
]

In a second step I am registering my Nvidia Jetson Nano as a new edge node with the IEAM instance. While doing so I am also providing some custom edge node properties specific to my device. Those custom properties can then later be used to create very flexible edge service deployment policies.

Examples for edge node properties could be any kind of location information (“is located at Golden Gate Bridge — North Side”), edge device hardware capabilities like e.g. “has a GPU”, “has 4 GB memory”, “ARM CPU” etc.

In my simple demo I am only setting two custom properties while registering my device, “openhorizon.allowPrivileged=true” and “location=WatsonCenterMunich”.

The first custom property allows the execution of docker containers in privileged mode.

hzn register --policy=./properties.json -f ./input-file-meta.json

The "properties.json" file:

{
  "properties": [
    {
      "name": "openhorizon.allowPrivileged",
      "value": true
    },
    {
      "name": "location",
      "value": "WatsonCenterMunich"
    }
  ]
}

The "input-file-meta.json" file:

{
  "services": [
    {
      "org": "ceh-edge-41",
      "url": "yolocuda-meta",
      "variables": {
        "CAM_URL": "http://<your_edge_device_hostname_or_ip>:8888"
      }
    }
  ]
}

After registering my device, one can check the current device property settings via the “hzn policy list” command:

{
  "properties": [
    {
      "name": "openhorizon.allowPrivileged",
      "value": true
    },
    {
      "name": "location",
      "value": "WatsonCenterMunich"
    },
    {
      "name": "openhorizon.hardwareId",
      "value": "daccd0d45b09ca6cffb697cd0b54096fc6d79b30"
    },
    {
      "name": "openhorizon.cpu",
      "value": 4
    },
    {
      "name": "openhorizon.arch",
      "value": "arm64"
    },
    {
      "name": "openhorizon.memory",
      "value": 3956
    }
  ]
}

In addition to the two custom properties you will notice four additional properties. Those have been automatically provided by the IEAM agent which had been installed on the edge device.
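For illustration, here is a rough sketch of how such built-in properties could be auto-detected with the Python standard library. This is my own toy example, not the agent's actual code; `detect_builtin_properties` is a hypothetical name, and note that `platform.machine()` may report an architecture string like "aarch64" rather than "arm64":

```python
# Illustrative sketch (not the IEAM agent's actual implementation) of
# how built-in node properties like openhorizon.cpu and
# openhorizon.arch could be auto-detected on the node itself.
import os
import platform

def detect_builtin_properties() -> list:
    """Gather properties analogous to the auto-detected ones above."""
    return [
        {"name": "openhorizon.cpu", "value": os.cpu_count()},
        # platform.machine() may return e.g. "x86_64" or "aarch64",
        # which would still need mapping to the agent's naming scheme.
        {"name": "openhorizon.arch", "value": platform.machine()},
    ]

for prop in detect_builtin_properties():
    print(prop["name"], "=", prop["value"])
```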

At this point my edge node is known to the IEAM server instance, and it already sends heartbeat messages whenever it is connected to the server.

In order to trigger the deployment of the "yolocuda" app container and its related service containers, one needs to create a valid deployment policy.

Here is the example deployment policy which I am using in my demo:

{
  "ceh-edge-41/policy-yolocuda-meta": {
    "owner": "ceh-edge-41/admin",
    "label": "Yolocuda Meta Deployment Policy",
    "description": "A sample Horizon Deployment Policy",
    "service": {
      "name": "yolocuda-meta",
      "org": "ceh-edge-41",
      "arch": "arm64",
      "serviceVersions": [
        {
          "version": "1.0.0",
          "priority": {},
          "upgradePolicy": {}
        }
      ],
      "nodeHealth": {}
    },
    "constraints": [
      "location == SomeWhere"
    ],
    "created": "2020-06-26T10:43:31.393Z[UTC]",
    "lastUpdated": "2020-08-04T10:45:02.992Z[UTC]"
  }
}

Please take note of the "constraints" property. All entries listed under "constraints" are used by the IEAM server to decide which edge devices qualify for a valid app/service deployment. In our demo case the "location" entry needs to match a device's "location" property.

So as long as the deployment policy’s “location” property equals “SomeWhere” nothing will happen on my edge device since its “location” property equals “WatsonCenterMunich”.

As soon as I change the “location” property in my deployment policy to match my device “location” property, IEAM starts to negotiate a service deployment agreement between the IEAM server and the IEAM agent.
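To illustrate the matching idea, here is a simplified sketch of how a "property == value" constraint could be evaluated against a node's property list. This is my own toy matcher, not the AgBot's actual policy engine (which supports a richer constraint language); `node_matches` is a hypothetical name:

```python
# Toy sketch of deployment-policy constraint matching (my own
# illustration, not the AgBot's real matching engine). Only exact
# "name == value" constraints are supported here.

def node_matches(constraints: list, properties: list) -> bool:
    """True only if every 'name == value' constraint is satisfied
    by a node property with that exact name and value."""
    props = {p["name"]: str(p["value"]) for p in properties}
    for constraint in constraints:
        name, _, expected = (part.strip() for part in constraint.partition("=="))
        if props.get(name) != expected:
            return False
    return True

# The node properties registered earlier in this demo:
node_props = [{"name": "location", "value": "WatsonCenterMunich"},
              {"name": "openhorizon.allowPrivileged", "value": True}]

print(node_matches(["location == SomeWhere"], node_props))           # prints False: no deployment
print(node_matches(["location == WatsonCenterMunich"], node_props))  # prints True: agreement can form
```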

Deployment policy constraints can be easily modified through the IEAM web UI (see the screenshot below).

IEAM web UI — deployment policy screen

As soon as a valid agreement has been created between the IEAM server and the IEAM agent on the edge device, all required service containers are pulled from their respective container registries and eventually executed on the edge device.

Here is an example for a successful deployment agreement:

hzn agreement list
[
  {
    "name": "Policy for ceh-edge-41/alexkoe-nvidia merged with ceh-edge-41/policy-yolocuda-meta",
    "current_agreement_id": "22396810ce7b07f0c807c24ff7a2945dbab62eff8afe42313f259de20d3e40ff",
    "consumer_id": "IBM/ceh-edge-41-agbot",
    "agreement_creation_time": "2020-08-19 18:50:24 +0200 CEST",
    "agreement_accepted_time": "2020-08-19 18:50:34 +0200 CEST",
    "agreement_finalized_time": "2020-08-19 18:50:43 +0200 CEST",
    "agreement_execution_start_time": "2020-08-19 18:50:55 +0200 CEST",
    "agreement_data_received_time": "",
    "agreement_protocol": "Basic",
    "workload_to_run": {
      "url": "yolocuda-meta",
      "org": "ceh-edge-41",
      "version": "1.0.0",
      "arch": "arm64"
    }
  }
]

As soon as the services are up and running on the edge node, the node doesn't need any connection to the IEAM server anymore and hence can operate completely in offline mode.

The IEAM agent also acts as a local service watchdog on the edge node. If e.g. one of the service containers fails during execution the agent will try to restart that container automatically. That feature can be also combined with some service version rollback capabilities. Please refer to my embedded YouTube video for a simulated service recovery example.

To conclude my introductory CEH blog article on IBM's Edge Application Manager, let's take a quick look at how the "achatina" demo looks when it's up and running on my Nvidia Jetson Nano (see the screenshot below).

The web interface for the "achatina" demo — provided by the "monitor" service on the Nvidia Jetson Nano

The standard YOLO v3 model tries to classify objects based on some pre-trained object classes and puts labelled boxes around those objects. The Nvidia Jetson Nano's GPU does a pretty good inferencing job for a $99 device (the inferencing time for one image is around 0.1–0.2 seconds), especially considering that all demo services are based on Python or shell scripts.

As a comparison: the typical inferencing time for the same demo on my MacBook Pro (CPU-only inferencing) is around 1 second. On a Raspberry Pi 3 it's around 30+ seconds.

To summarize: at the beginning of my article I started with some smartphone-based personal edge computing use cases, and then guided you through a simple but still powerful AI edge computing example based on IBM Edge Application Manager.

Now imagine what you could do at the edge if you expanded some of those core edge computing principles — doing autonomous AI on the edge, implementing data sovereignty, or dramatically reducing data processing latency — all while operating in a partially or completely disconnected mode:

  • How about improving e.g. worker safety by providing personal edge devices which monitor not just your key body data, but also the workplace environment for dangerous conditions?
  • How about retrofitting all those aging and old bridges even in rural areas with some vibration and sound monitors to warn about potentially critical states?
  • In light of the ongoing tragic pandemic situation, how about anonymously monitoring sensitive areas like retail outlets, large gathering places or public transportation hotspots for social distancing measures and the correct wearing of face masks (where applicable)?
  • How about equipping even older manufacturing equipment with edge devices which can apply audio and/or video-based analytics to predict potential machinery failures?
  • How about….?

This time I will leave it up to you to come up with innovative edge computing use cases to break down those increasingly blurred hybrid cloud boundaries — and to "tame" your enterprise computing edges!
