Yes, You Can Deploy Edge AI Models With Privacy Preservation!

Aolin Ding · Published in Labs Notebook · 6 min read · Sep 11, 2023

In this blog post, we discuss privacy-preserving edge AI work conducted by the Accenture Labs Security R&D team in Washington, DC. This work is part of our Trustworthy AI project, which explores a wide range of topics, including Responsible AI, Robust AI, and Explainable AI. We have shared our insights in recently published research papers on federated learning, adversarial training, privacy-preserving inference, and other techniques.

What is Edge AI?

With recent advances in artificial intelligence (AI) and deep neural networks (DNNs), sophisticated robotic and IoT applications increasingly leverage AI on edge-cloud infrastructures. For those not already familiar with it, edge AI is a class of machine learning (ML) architecture that performs data processing and analysis on decentralized hardware devices operating close to the sources of data. This approach can reduce latency, improve privacy, cut bandwidth usage, and improve the performance of local AI applications.

Edge AI is a promising technology with great potential to revolutionize fields like autonomous driving, medical diagnostics, and industrial automation. In the future of edge AI, we can expect even more innovative applications, such as robots with embedded generative AI capabilities powered by large language models (e.g., GPT) that provide human-like responses to user prompts.

Major challenges and privacy concerns of Edge AI

Even though edge AI is a valuable technique, a range of challenges must be overcome for its practical deployment on edge devices and endpoint IoT devices, such as resource constraints and application-specific privacy restrictions.

Coupled with recent advances in ML, a noticeable trend is that AI models have become increasingly complex, with ever-growing parameter counts. As illustrated in Figure 1(a), edge devices are typically small, embedded systems with limited processing power, memory, and energy, making it difficult to handle such demanding AI models effectively. These resource limitations hinder edge devices from processing and storing large models, limiting their practicality for running complex DNNs locally.

Fig. 1. The challenges of the DNN deployment for an edge-cloud system.

Offloading the computational burden to a cloud server is often considered an alternative solution: edge devices collect data, process it locally, and send queries to the cloud for inference. However, transferring data samples to the cloud poses privacy challenges. For instance, a smart home camera can identify human faces, but it may fail to meet legal and ethical privacy requirements if it uploads sensitive face data to a vendor-managed cloud for AI-based facial recognition. Figure 1(b) highlights the inherent challenges of this scenario.

Make your Edge AI a breeze with model partitioning

In the world of edge AI, model partitioning, also referred to as split learning (SL), is gaining popularity. SL splits a complete DNN model into sub-networks and distributes them to multiple computation nodes during training. SL naturally distributes the model information (e.g., weights, biases, hyperparameters) and separates the training process across two entities, avoiding raw data exchanges between the edge device and the cloud server.
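As a minimal sketch of this idea (with made-up layer sizes, not the setup from our work), splitting a network so that the first layers run on the edge device and the rest run in the cloud might look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    # A toy fully connected layer (linear + ReLU) for illustration.
    W = rng.standard_normal((n_in, n_out)) * 0.1
    return lambda x: np.maximum(x @ W, 0.0)

# The full model is an ordered list of layers; everything up to the
# split point stays on the edge device, the rest goes to the cloud.
full_model = [dense(32, 64), dense(64, 64), dense(64, 10)]
split_point = 1
edge_net, cloud_net = full_model[:split_point], full_model[split_point:]

def edge_forward(x):
    for layer in edge_net:
        x = layer(x)
    return x  # "smashed" activations cross the network, not raw data

def cloud_forward(a):
    for layer in cloud_net:
        a = layer(a)
    return a

x = rng.standard_normal((8, 32))        # raw inputs never leave the edge
activations = edge_forward(x)           # transmitted to the cloud
logits = cloud_forward(activations)     # inference completes remotely
print(activations.shape, logits.shape)  # (8, 64) (8, 10)
```

Only the intermediate activations are exchanged between the two parties, which is what makes SL attractive for privacy, and also what motivates the privacy analysis below.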

Fig. 2. Memory usage and computation cost analysis of single layer and partial edge model after partitioning

To see how partitioning addresses resource constraints, consider the VGG16 model in Figure 2(a). To optimize computational capability and memory usage, our approach first applies a per-layer analysis to determine each layer's resource consumption in terms of floating-point operations (FLOPs) and memory. This analysis lets us accumulate the FLOPs and memory usage of the two sub-networks destined for the edge device and the cloud server, as depicted in Figures 2(b) and 2(c). Finally, we compare the total computation cost and memory usage of the edge device's model portion with the resource capabilities of the target edge platform, ensuring that the split point meets the necessary resource requirements.
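The per-layer accounting described above can be sketched as follows. The layer costs and device budgets here are hypothetical placeholders, not VGG16's actual numbers:

```python
# Accumulate FLOPs and parameter memory layer by layer and keep every
# candidate split point whose edge-side portion fits the device budget.
layers = [  # (name, MFLOPs, parameter memory in MB) -- illustrative only
    ("conv1", 90.0, 0.01),
    ("conv2", 1850.0, 0.14),
    ("conv3", 925.0, 0.28),
    ("fc", 205.0, 392.0),
]

EDGE_MFLOPS_BUDGET = 3000.0    # hypothetical edge-device compute capability
EDGE_MEMORY_BUDGET_MB = 1.0    # hypothetical edge-device memory capacity

feasible = []
flops = mem = 0.0
for i, (name, f, m) in enumerate(layers):
    flops += f
    mem += m
    if flops <= EDGE_MFLOPS_BUDGET and mem <= EDGE_MEMORY_BUDGET_MB:
        feasible.append(i)  # splitting after layer i fits on the edge

print(feasible)  # [0, 1, 2]
```

Here the fully connected layer's large parameter memory rules out keeping it on the edge, which mirrors why split points for VGG-style models typically fall in the convolutional stack.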

Edge AI deployment with privacy preservation

In addition to addressing resource constraints, another crucial aspect we prioritize is data privacy, which is achieved by integrating privacy-preserving techniques into the SL paradigm. We give an illustrative example in Figure 3. Here, we split a complete DNN model into two portions at the split point and distribute them to the edge device and the cloud server, respectively. However, SL is not perfect: it creates a potential privacy vulnerability by exposing intermediate-layer data, which might allow attackers to reconstruct the input data by analyzing the data exchanged during edge-cloud communication.

Fig. 3. Privacy metric and task loss function calculation in model partitioning

To investigate the privacy risks, we apply the distance correlation (DCOR) techniques introduced in this paper to quantitatively measure how easily input samples can be reconstructed from the intermediate activation outputs at the split layer. Distance correlation is a normalized metric derived from distance covariance, which measures the dependence between two vectors of possibly different dimensions. Specifically, we calculate the DCOR between the raw inputs x and the intermediate activation outputs a at the split layer for each batch as follows:

DCOR(x, a) = dCov(x, a) / √(dCov(x, x) · dCov(a, a))
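As a rough illustration (not the implementation from our work), the biased sample estimator of distance correlation can be computed with NumPy:

```python
import numpy as np

def distance_correlation(X, Z):
    """Biased sample distance correlation between two batches.
    Rows are samples; X and Z may have different feature dimensions."""
    def doubly_centered_dists(M):
        # pairwise Euclidean distances, then double-centering
        d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=-1)
        return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()

    A = doubly_centered_dists(X)
    B = doubly_centered_dists(Z)
    dcov2 = (A * B).mean()  # squared sample distance covariance
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 16))                        # "raw inputs"
Z_dep = np.tanh(X @ rng.standard_normal((16, 8)))        # activations of X
Z_ind = rng.standard_normal((64, 8))                     # unrelated data

print(distance_correlation(X, X))      # identical data: DCOR of 1
print(distance_correlation(X, Z_dep))  # high: activations depend on X
print(distance_correlation(X, Z_ind))  # lower: statistically independent
```

Note that the metric accepts different dimensionalities for the two arguments, which is exactly what makes it suitable for comparing raw inputs against split-layer activations.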

During our learning protocol, the DNN model parameters are updated not only by the prediction loss but also by the data privacy measurement, which can be represented by the following loss function:

L = α1 · DCOR(x, a) + α2 · Ltk(ŷ, y)

where DCOR is the distance correlation metric between the raw inputs x and the split-layer activations a, Ltk is the task loss of the distributed model (e.g., cross-entropy for a classification task), ŷ is the model's prediction, and y is a suitable label for the target task (if any). The coefficients α1 and α2 define the relevance of distance correlation in the final loss, creating and managing a tradeoff between data privacy (i.e., how much information an attacker can recover from the smashed data) and the model's utility on the target task (e.g., inference accuracy). For attackers, this optimization increases the difficulty of reconstructing the original input samples from the intermediate activation data.
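A minimal sketch of such a combined objective, with hypothetical coefficient values and a compact reimplementation of the DCOR estimator:

```python
import numpy as np

def dcor(X, Z):
    # Compact biased sample distance correlation (same estimator as above).
    def c(M):
        d = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=-1)
        return d - d.mean(axis=0) - d.mean(axis=1, keepdims=True) + d.mean()
    A, B = c(X), c(Z)
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max((A * B).mean(), 0.0) / denom) if denom > 0 else 0.0

def cross_entropy(logits, labels):
    # Standard softmax cross-entropy as the task loss Ltk.
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()

ALPHA1, ALPHA2 = 0.5, 1.0  # hypothetical trade-off coefficients

def privacy_aware_loss(x, activations, logits, labels):
    # Total loss: privacy term on the smashed data plus the task loss.
    return ALPHA1 * dcor(x, activations) + ALPHA2 * cross_entropy(logits, labels)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 16))
activations = np.tanh(x @ rng.standard_normal((16, 8)))
logits = rng.standard_normal((32, 10))
labels = rng.integers(0, 10, size=32)
loss = privacy_aware_loss(x, activations, logits, labels)
```

In an actual training loop, gradients of this combined loss would flow through the split layer, pushing the edge sub-network to produce activations that are useful for the task yet statistically decorrelated from the raw inputs.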

Balancing Model Accuracy and Privacy Preservation

In our study, we explore how the placement of the split point in a DNN model affects its resource requirements, data privacy, and inference accuracy. We empirically evaluate the effectiveness of our framework by presenting the training results of VGG16 on the CIFAR-10 dataset for different split points within the model, reporting the DCOR before and after training as well as the final inference accuracy on the test set.

Our observations reveal two key findings. First, the split point influences the initial DCOR of a split model: the initial DCOR decreases as more layers are shifted to the edge device. This is intuitive, since additional layers between the intermediate outputs and the raw input samples reduce their statistical dependence. Second, despite varying computational profiles and resource consumption among partitioning strategies, inference accuracy remains stable across different split layers, with several split points even outperforming the baseline accuracy of conventional non-split configurations.

Fig. 5. Scatter plot of inference accuracy versus distance correlation for VGG16 trained on CIFAR-10, with the split placed at each different layer of VGG16.

In Figure 5, we provide a visual representation that helps users determine the ideal split point by examining the trade-off between privacy protection and inference accuracy for VGG16 split learning on CIFAR-10. By assessing their specific accuracy and privacy requirements, users can effectively select the optimal split point in practice.
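In practice, split-point selection from such results might be sketched as follows. The accuracy and DCOR numbers below are made up for illustration, not taken from Figure 5:

```python
# Hypothetical results per candidate split point:
# split layer index -> (test accuracy, final DCOR of smashed data)
results = {
    1: (0.915, 0.62),
    3: (0.918, 0.48),
    5: (0.921, 0.35),
    7: (0.913, 0.21),
}

MIN_ACCURACY = 0.915  # user-defined utility requirement

# Among split points meeting the accuracy requirement, pick the most
# private one (lowest distance correlation, hardest to reconstruct from).
eligible = {k: v for k, v in results.items() if v[0] >= MIN_ACCURACY}
best_split = min(eligible, key=lambda k: eligible[k][1])
print(best_split)  # 5
```

Tightening or relaxing the accuracy threshold moves the chosen split along the trade-off curve of Figure 5, which is exactly the decision the scatter plot is meant to support.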

Summary

In this article, we have introduced an adaptive DNN partitioning framework that addresses the resource constraints and privacy concerns of edge devices while maintaining high accuracy. Our framework investigates the resource limitations and privacy requirements of the target edge platform to determine the split point for a DNN model, and thus fulfills the computational and privacy needs without loss of accuracy. We achieve this by designing an adaptive DNN partitioning strategy and integrating a distance correlation-based privacy metric into the model training. Furthermore, our approach can be seamlessly extended to various types of neural network models, accommodating both sequential and non-sequential architectures, setting it apart from conventional distance correlation approaches.

Further Resources

If you would like to learn more about this work, you can read our complete paper, published at ICONIP 2023.

Aolin Ding
Ph.D. in Cybersecurity | Researcher at Accenture | System Security & Trustworthy Computing