A Brief Introduction to Edge Computing and Deep Learning
For Deep Learning systems and applications, Edge Computing addresses issues with scalability, latency, privacy, reliability, and on-device cost.
Welcome to my first blog on topics in artificial intelligence! Here I will introduce the topic of edge computing, with context in deep learning applications.
This blog is largely adapted from a survey paper written by Xiaofei Wang et al.: Convergence of Edge Computing and Deep Learning: A Comprehensive Survey. If you’re interested in learning more about any topic covered here, there are plenty of examples, figures, and references in the full 35-page survey: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8976180.
Now, before we begin, I’d like to take a moment and motivate why edge computing and deep learning can be very powerful when combined:
Why is Deep Learning important?
Deep learning is an increasingly capable branch of machine learning that allows computers to detect objects, recognize speech, translate languages, and make decisions. More machine learning problems are solved by the day with the advanced techniques that researchers discover. Many of these techniques, alongside applications that require scalability, consume large amounts of network bandwidth, energy, or compute power. Some solutions already exist to address these concerns, such as parallel computing on Graphics Processing Units (GPUs) and optical networks for communication. However, with an explosive field like deep learning finding new methods and applications, an entirely new field is being fueled to match and possibly surpass this demand. Introducing: Edge Computing.
Why is Edge Computing important?
Edge computing is a distributed computing paradigm that brings computation and data storage closer to the location where it is needed, to improve response times and save bandwidth. Take for example the popular content streaming service Netflix. Netflix has a powerful recommendation system to suggest movies for you to watch. It also hosts an extraordinary amount of content on its servers that it needs to distribute. As Netflix scales up to more customers in more countries, its infrastructure becomes strained. Edge Computing can make this system more efficient. We’ll begin with the two major paradigms within Edge Computing: edge intelligence and the intelligent edge.
More connected devices are being introduced to us by the day. Our phones, computers, tablets, game consoles, wearables, appliances, and vehicles are all gaining varying levels of intelligence — meaning they can communicate with other devices or perform computations to make decisions. This is edge intelligence. You might ask why this is important at all, but it turns out that as our products and services become more complex and sophisticated, new problems arise from latency, privacy, scalability, energy cost, or reliability perspectives. Use of edge intelligence is one way we can address these concerns. Edge intelligence brings a lot of the compute workload closer to the user, keeping information more secure, delivering content faster, and lessening the workload on centralized servers.
Much like edge intelligence, the intelligent edge brings content delivery and machine learning closer to the user. Unlike edge intelligence, the intelligent edge introduces new infrastructure in a location convenient to the end user or end device. Let's take our Netflix example again. Netflix has its headquarters in California, but wants to serve New York City, roughly 4,000 kilometers away. Combine latency with the time it takes to compute a recommended selection of movies for millions of users, and you've got a pretty subpar service. The intelligent edge can fix this!
Instead of having an enormous datacenter with every single Netflix movie stored on it, let's say we have a smaller datacenter with the top 10,000 movies stored on it, and just enough compute power to serve the population of New York City (rather than enough to serve all of the United States). Let's also say we'll build 5 of these data centers, one for each borough of New York City (Manhattan, Brooklyn, Queens, Bronx, Staten Island), so that the data center is even closer to the end user and, if one server needs maintenance, we have backups. What have we just done? We've introduced new infrastructure, albeit with less power, but just enough to provide an even better experience to the end user than the most powerful systems centralized in one location could. The idea of the intelligent edge is scalable too: we can imagine it at a country-wide scale or at a scale as simple as a single warehouse.
What can we achieve if we combine the two?
For the remainder of this blog, we'll dive a bit deeper into Edge Computing paradigms to get a better understanding of how they can improve our deep learning systems, from training to inference. I will also briefly introduce a paper that discusses an edge computing application for smart traffic intersections and use it as context to ground the concepts that follow.
About COSMOS Smart Intersection:
From the paper Abstract: Smart city intersections will play a crucial role in automated traffic management and improvement in pedestrian safety in cities of the future. They will (i) aggregate data from in-vehicle and infrastructure sensors; (ii) process the data by taking advantage of low-latency high-bandwidth communications, edge cloud computing, and AI-based detection and tracking of objects; and (iii) provide intelligent feedback and input to control systems. The Cloud Enhanced Open Software Defined Mobile Wireless Testbed for City-Scale Deployment (COSMOS) enables research on technologies supporting smart cities.
You can view the full paper here: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9156225.
Smart cities are perhaps one of the best examples to demonstrate the need and potential for edge compute systems. In this application for traffic intersections, we could imagine that there are some challenges to address as we move to a more autonomous future:
- We want to make crossing the street safer for pedestrians while more cars become self-driven. We can do this by adding computer vision systems to intersections to watch for potential collisions. But vehicles travel very fast, so a real-time vision system must have ultra-low latency.
- Large metropolitan cities easily have hundreds, if not thousands, of intersections. If we wanted to add a vision system to some of them, a centralized compute system is more than likely to hit bottlenecks in data processing.
- Every intersection is going to look a bit different from the others. Could you really train one vision system to work seamlessly at each intersection? What if some intersections have a lot more leaves that fall during autumn? Or more snow that builds up during the winter? Taller buildings casting darker shadows? It seems that we might have an interest in machine learning models that can adapt to changing conditions.
- Construction projects can happen at any time, completely changing what an intersection looks like. In this case we may need to retrain our models, possibly while still performing inference. They say New York City is the city that never sleeps!
This hopefully stimulates some ideas for how state-of-the-art deep learning solutions have limitations in an application like smart cities. Read on to see how edge computing can help address these concerns!
Five essential technologies for Edge Deep Learning:
- Deep Learning Applications on the Edge: Technical frameworks for systematically organizing edge computing and DL to provide intelligent services
- Deep Learning Inference in the Edge: Focusing on the practical deployment and inference of DL in the edge computing architecture to fulfill different requirements, such as accuracy and latency
- Edge Computing for Deep Learning: Which adapts the edge computing platform in terms of network architecture, hardware and software to support DL computation
- Deep Learning Training: Training DL models for edge intelligence at distributed edge devices under resource and privacy constraints
- Deep Learning for Optimizing the Edge: Application of DL for maintaining and managing different functions of edge computing networks (systems), e.g., edge caching, computation offloading
Now, let’s take a closer look at each one.
Deep Learning Applications on the Edge
Applications on the edge comprise hybrid hierarchical architectures (try saying that five times fast). This architecture is divided into three levels: end, edge, and cloud. Here's an example from the paper demonstrating real-time video analytics.
At the cloud level, we have our traditional large deep neural network (DNN). Alternatively, the cloud can host the majority of a network that is shared between the cloud and the edge. At the edge level, we have the remaining minority of that shared network alongside a smaller, separately trained DNN. Finally, at the end level are our end devices. These can be sensors or cameras for collecting data, or a variety of devices for observing results and information. What's important to note here is the collaboration between the cloud and the edge. Edge infrastructure lives closer to the end level. Recall that the edge has less compute capability, so hosting our large DNN there will likely give us poor performance. However, we can still host a smaller DNN there that gets results back to the end devices quickly. We can additionally run an early segment of a larger DNN at the edge, so that computations begin at the edge and finish in the cloud. This is a solution that reduces latency by removing the bottleneck at the edge level and reducing propagation delay to the cloud level.
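To make the edge/cloud split concrete, here is a minimal sketch in plain Python. The "layers", weights, and split point are all invented for illustration; a real deployment would partition an actual DNN and ship the intermediate activation over the network.

```python
# Toy model partitioning: the first SPLIT layers run at the edge,
# the remaining layers run in the cloud.

def relu(x):
    return [max(0.0, v) for v in x]

def dense(weights):
    # Returns a layer that multiplies the input vector by a weight matrix.
    def layer(x):
        return [sum(w * v for w, v in zip(row, x)) for row in weights]
    return layer

# A hypothetical 4-layer "network".
layers = [
    dense([[1.0, -1.0], [0.5, 0.5]]), relu,   # edge segment
    dense([[2.0, 0.0], [0.0, 2.0]]), relu,    # cloud segment
]
SPLIT = 2  # layers[:SPLIT] run at the edge, the rest in the cloud

def edge_forward(x):
    for layer in layers[:SPLIT]:
        x = layer(x)
    return x  # intermediate activation sent over the network

def cloud_forward(x):
    for layer in layers[SPLIT:]:
        x = layer(x)
    return x

activation = edge_forward([3.0, 1.0])   # computed near the sensor
result = cloud_forward(activation)      # finished in the data center
```

The design knob here is `SPLIT`: moving it later shrinks the cloud's share of the work but raises the edge node's compute and the size of the activation that must cross the network.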
Deep Learning Inference in the Edge
The best DNNs require deeper architectures and larger-scale datasets, and thus indirectly require cloud infrastructure to handle the computational cost of deep architectures and large quantities of data. This requirement, however, limits how widely deep learning services can be deployed.
The figure above depicts an inference system entirely on the edge — no cloud at all! Let's break it down. Our camera system on the far left is placed near a common pedestrian walkway; let's say it's to help us find a child that was separated from their parent. To accomplish this task, our DNN must be capable of both detecting humans and recognizing them, to make sure we find the right child (their parent shares a photo so we know what to look for). Sounds like a job for the cloud, right? What if, instead, we used an edge platform specifically for finding the Region-of-Interest (RoI)? This yields a much smaller search space for object recognition, now that less-relevant parts of our image have been removed. We can feed the reduced search space to a second edge platform that performs the inference for matching the child in the photo provided. This distributed system therefore completes the same task that would normally be allocated to the cloud. The ability to deploy a system like this dramatically increases the potential for deployment in places further away from — or completely disconnected from — the cloud!
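The two-stage pipeline can be sketched as follows. The detection and matching functions here are hypothetical stand-ins for real DNN inference on two separate edge platforms, and the "feature" scores are invented for illustration.

```python
# Stage 1 (edge platform 1): keep only person-like regions (the RoIs),
# discarding the rest of the frame.
def detect_rois(frame):
    return [region for region in frame if region["label"] == "person"]

# Stage 2 (edge platform 2): compare each RoI's feature to the target
# photo's feature and report matches above a similarity threshold.
def match_target(rois, target_feature, threshold=0.8):
    return [r for r in rois if r["feature"] * target_feature >= threshold]

# A toy "frame": three detected regions with fake similarity features.
frame = [
    {"label": "person", "feature": 0.95},
    {"label": "dog", "feature": 0.99},
    {"label": "person", "feature": 0.40},
]
rois = detect_rois(frame)          # reduced search space
matches = match_target(rois, 1.0)  # expensive matching runs on RoIs only
```

The point of the split is that the expensive recognition stage only ever sees the RoIs, not the full frame, which is what lets two modest edge boxes replace a cloud round trip.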
Edge Computing for Deep Learning
So far we've talked about how we can stretch a DNN architecture across cloud, edge, and end devices. This is only one approach to deploying edge computing systems. Truly, the design, adaptation, and optimization of edge hardware and software are equally important. We will discuss that in this section.
- Edge Hardware for DL: Designing an edge node depends on whichever metrics matter most for an application; this drives the decision of whether to run on CPU, GPU, FPGA, or ASIC hardware.
- Communication and Computation Modes for Edge DL: There are a variety of communication modes we could choose for our edge system. We may need to define a threshold at which compute workload must be offloaded from the edge to the cloud (such as during high network bandwidth usage). We may alternatively have several edge nodes cooperate on a task.
- Tailoring Edge Frameworks for DL: Commonly used frameworks like TensorFlow and PyTorch work well, though models developed in them may not be executable on the edge. The frameworks themselves can therefore be optimized to adapt to configurations that best suit the available compute resources and data.
- Performance Evaluation for Edge DL: No standard test bench for edge DL exists at the time of writing, but to fully realize the potential of edge systems (as well as their combination with cloud and end devices), end-to-end metrics will be appropriate, especially for vehicular scenarios.
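The threshold-based offloading mode mentioned above can be sketched with a toy policy. The thresholds and inputs here are invented for illustration; a real system would measure actual queue depth, link bandwidth, and task deadlines.

```python
# A toy threshold-based offloading policy for an edge node.
def choose_placement(edge_queue_len, bandwidth_mbps,
                     edge_capacity=10, min_bandwidth=50):
    # Run locally while the edge node has headroom.
    if edge_queue_len < edge_capacity:
        return "edge"
    # Edge is saturated: offload only if the uplink is fast enough
    # to make the round trip to the cloud worthwhile.
    if bandwidth_mbps >= min_bandwidth:
        return "cloud"
    # Saturated but the uplink is too slow: queue locally anyway.
    return "edge"

print(choose_placement(3, 20))    # headroom at the edge
print(choose_placement(15, 100))  # saturated, fast link
print(choose_placement(15, 20))   # saturated, slow link
```

Even this crude policy shows why the decision is two-dimensional: edge load alone is not enough, because offloading over a congested link can be slower than waiting in the local queue.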
Deep Learning Training
A variety of concerns arise regarding training. For example, for real-time training applications, aggregating data in time for training batches may incur high network costs. For other applications, merging data could violate privacy. While niche, these legitimate concerns justify exploring end-edge-cloud systems for deep learning training.
Distributed training has been around for some time. It can be traced back to a proposed edge computing solution to a large-scale linear regression problem, in the form of a decentralized stochastic gradient descent method.
This figure shows two examples of a distributed training network. On the left, it is the end devices that train models from local data, with weights being aggregated at an edge device one level up. On the right, training data is instead fed to edge nodes that progressively aggregate weights up the hierarchy.
Federated Learning: Federated Learning (FL) is also an emerging deep learning mechanism for training among end, edge, and cloud. Without requiring data to be uploaded for central cloud training, Federated Learning allows edge devices to train their local DL models on their own collected data and upload only the updated model instead.
As depicted in the figure below, FL iteratively solicits a random set of edge devices to 1) download the global DL model from an aggregation server (hereafter, "server"), 2) train their local models on the downloaded global model with their own data, and 3) upload only the updated model to the server for model averaging.
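The three FL steps above can be sketched as a single federated averaging round. Models here are plain weight vectors and the "local training" is a fake one-step update toward each device's data mean; real FL would run SGD on each device's private dataset.

```python
# Step 2: each device nudges the downloaded global model toward its
# own data (a single gradient-like step toward the local data mean).
def local_update(global_weights, local_data):
    target = sum(local_data) / len(local_data)
    return [w + 0.5 * (target - w) for w in global_weights]

# Steps 1 and 3: devices download the model, train locally, and the
# server averages the returned weights (model averaging).
def federated_round(global_weights, devices):
    updates = [local_update(global_weights, data) for data in devices]
    n = len(updates)
    return [sum(u[i] for u in updates) / n
            for i in range(len(global_weights))]

devices = [[1.0, 2.0], [3.0, 5.0]]   # each device's private data
weights = federated_round([0.0], devices)
```

Note what never leaves a device: the raw data lists. Only the updated weight vectors travel to the server, which is the privacy property FL is built around.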
Federated learning can address several key challenges in edge computing networks: Non-IID training data, limited communication, unbalanced contribution, and privacy and security. See section 7, subsection B in the paper for more details for how FL achieves this.
Deep Learning for Optimizing the Edge
DNNs (general DL models) can extract latent data features, while deep reinforcement learning (DRL) can learn to handle decision-making problems by interacting with the environment. With regard to various edge management issues such as edge caching, offloading, communication, and security protection: 1) DNNs can process user information and data metrics in the network, as well as perceive the wireless environment and the status of edge nodes; and, based on this information, 2) DRL can be applied to learn long-term optimal resource management and task scheduling strategies, achieving intelligent management of the edge, i.e., the intelligent edge.
This leaves significant room for open-ended work, where we can apply DNNs or DRL to resource management such as caching (i.e., reducing redundant data transmissions), task offloading, or maintenance. See the attached table from the paper for how this may be used.
Lessons Learned and Open Challenges
This blog covers use cases of edge computing for deep learning at a surface level, highlighting many applications for deploying deep learning systems as well as applications for metrics and maintenance. If these ideas resonated with you, you might agree that this opens avenues for more deep learning applications like self-driving cars, cloud-based services like gaming, or training DNNs entirely offline for research purposes.
With the rising potential of edge computing and deep learning, the question is also raised as to how we should go about measuring the performance of these new systems, or determining compatibility across the end, edge, and cloud:
- Generalization of EEoI (Early Exit of Inference): We don't always want to be responsible for choosing when to exit early. This is an area deep reinforcement learning can explore.
- Hybrid model modification: Our cloud model may need to be pruned to run on edge nodes or end devices. Can it still perform well after that?
- Coordination between training and inference: Consider a deployed traffic monitoring system that has to adjust after road construction, or for weather and changing seasons. It must retrain itself. Training is expensive, so how can it be coordinated with inference once the roads open again?
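The EEoI idea in the list above is simple to sketch: attach an exit branch partway through the network and stop as soon as its confidence clears a threshold. The stages and confidence values below are invented for illustration, and the threshold is exactly the hand-chosen knob that DRL might learn instead.

```python
# Toy Early Exit of Inference: each stage returns (features, confidence);
# later stages are more accurate but more expensive to compute.
def run_with_early_exit(stages, x, threshold=0.9):
    for depth, stage in enumerate(stages, start=1):
        x, confidence = stage(x)
        if confidence >= threshold:
            return x, depth  # exit early: a cheap answer was good enough
    return x, len(stages)

# Hypothetical stages: the first is uncertain, the second is confident.
stages = [
    lambda x: (x + 1, 0.6),
    lambda x: (x + 1, 0.95),
    lambda x: (x + 1, 0.99),  # never reached for this input
]
result, exited_at = run_with_early_exit(stages, 0)
```

On easy inputs the network pays for two stages instead of three; the open question from the bullet above is how to pick (or learn) `threshold` so that the saved compute does not come at the cost of accuracy.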
On top of this, the introduction of edge hardware comes with its own unique challenges. Some examples are:
- Edge for Data Processing
- Microservice for Edge Deep Learning Services
- Incentive and trustworthy offloading mechanisms for Deep Learning
- Integration with DL for optimizing Edge
Lastly (and before the details get too confusing!), we might also be interested in practical training principles at the edge. The idea here is that we want standards for how our system trains. A lot of these questions are open-ended, meaning many solutions can (and cannot) be used to address the problem. Some example questions to ponder: Where does training data come from? Say we want to deploy a Federated Learning model; that means the cloud does not delegate data, so is data collected on site? Are edge nodes communicating over a blockchain? There may be synchronization issues because of edge device constraints (e.g., power), and so on.
Thank you for checking out my blog! I hope you learned something new, and I hope you learned something useful. Leave a comment if this blog helped you! Also, feel free to connect with me on LinkedIn: http://linkedin.com/in/christophejbrown
 X. Wang, Y. Han, V. C. M. Leung, D. Niyato, X. Yan and X. Chen, “Convergence of Edge Computing and Deep Learning: A Comprehensive Survey,” in IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 869–904, Secondquarter 2020, doi: 10.1109/COMST.2020.2970550. Link: https://ieeexplore.ieee.org/document/8976180
 S. Yang et al., “COSMOS Smart Intersection: Edge Compute and Communications for Bird’s Eye Object Tracking,” 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Austin, TX, USA, 2020, pp. 1–7, doi: 10.1109/PerComWorkshops48775.2020.9156225. Link: https://ieeexplore.ieee.org/document/9156225