Cloud vs On-device AI? Maybe something in between!
Put your AI on the cloud only when you need to put it on the cloud.
Cloud or on-device intelligence? Or maybe something in between? In this short article, we show how a simple collaborative approach between the mobile device and the cloud can save you from offloading about 50% of your precious data to the cloud while still achieving cloud-level accuracy.
Cloud vs On-device AI: Pros and Cons
Let’s review the pros and cons of cloud vs on-device AI. On-device AI offers better privacy and independence from remote resources, but the model may be less accurate because the device lacks powerful compute resources. Cloud AI, on the other hand, offers more accurate models than those that fit on a mobile device, but it requires a persistent network connection, which can make it unreliable.
Use cloud AI only when you need to use cloud AI
The approach is based on putting a threshold on the confidence level of a neural network. If the network is confident in its prediction, we are fine; if not, we ask for help from a more intelligent cloud-hosted model! Let’s say you want to classify the 1000 object classes of the ImageNet dataset. Assume we use MobileNetV2 as a lightweight model on the mobile device and ResNext101 as our most accurate (but large!) model on the cloud. The approach is simple:
Hey mobile! Offload the inference to the cloud if you are not sure about your predictions!
We treat the highest softmax probability as the confidence level of a prediction and put a threshold on the confidence level of the mobile predictions. We use the torchvision models pre-trained on ImageNet, which have top-1 accuracies of 71.88% and 79.31% for MobileNetV2 and ResNext101, respectively.
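To make the rule concrete, here is a minimal sketch of the routing logic in plain Python. It is not the exact code behind our experiments: `mobile_model` and `cloud_model` are hypothetical callables that return a list of logits for an input, and the default `threshold` of 0.5 is only an illustrative value.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(sample, mobile_model, cloud_model, threshold=0.5):
    """Return (predicted_class, source), offloading when the mobile model is unsure.

    mobile_model and cloud_model are hypothetical callables mapping an
    input to a list of logits; threshold is an illustrative value.
    """
    probs = softmax(mobile_model(sample))
    confidence = max(probs)  # highest softmax probability
    if confidence >= threshold:
        # The mobile model is confident enough: keep the data on the device.
        return probs.index(confidence), "mobile"
    # Otherwise ask the larger cloud-hosted model for help.
    cloud_probs = softmax(cloud_model(sample))
    return cloud_probs.index(max(cloud_probs)), "cloud"
```

In a real application the cloud call would be a network request, so you would probably also want to fall back to the mobile prediction when the connection is down.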
Interpretation of the curve: If all the inference cases are performed on the cloud server, then the accuracy will be 79.31%, which is the accuracy of ResNext101. If all the inference cases are performed on the mobile device, then the accuracy will be 71.88%. If 49.88% of the inference cases are performed on the mobile device and the rest on the cloud, then we can still achieve the cloud-level accuracy of 79.31%.
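A curve like this can be traced by sweeping the threshold and recording, for each value, the fraction of cases kept on the mobile device and the resulting overall accuracy. The sketch below shows one way to compute a single point; the function name `hybrid_tradeoff` and the `records` layout are our own assumptions, and the six records are made-up toy values, not ImageNet results.

```python
def hybrid_tradeoff(records, threshold):
    """Return (local_fraction, accuracy) for one confidence threshold.

    records is a list of (mobile_confidence, mobile_correct, cloud_correct)
    tuples, one per validation sample (an assumed layout).
    """
    local = 0
    correct = 0
    for confidence, mobile_ok, cloud_ok in records:
        if confidence >= threshold:
            local += 1  # handled on the device
            correct += bool(mobile_ok)
        else:
            correct += bool(cloud_ok)  # offloaded to the cloud
    total = len(records)
    return local / total, correct / total


# Toy data for illustration only.
toy_records = [
    (0.98, True, True),
    (0.91, True, True),
    (0.75, True, False),
    (0.60, False, True),
    (0.40, False, True),
    (0.30, False, False),
]

for t in (0.0, 0.5, 0.8, 1.01):
    local_fraction, accuracy = hybrid_tradeoff(toy_records, t)
    print(f"threshold={t:.2f} local={local_fraction:.2f} accuracy={accuracy:.2f}")
```

A threshold of 0 keeps everything on the device, and a threshold above 1 sends everything to the cloud, so the two ends of the sweep correspond to the two ends of the curve.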
In a nutshell, we don’t always need a cloud server to add AI to our applications. We can host small models on resource-constrained devices and call the cloud servers only when the predictions are uncertain. Calibrating the confidence levels of neural networks can further increase the percentage of inference cases that are performed locally.
The author would like to thank Mohammad Saeed Abrishami for his help with this blog post.