Building an IoT Solution using GCP

Nirav Kothari
Published in GDGCloudMumbai
5 min read · May 21, 2021
Image by Forbes

The world is embracing digitization by transforming traditional business processes into smart ones. Sales and marketing are moving online, and so are business operations. Whether it is a manufacturing plant, a retail showroom, or a commercial building, digital transformation has become essential for smart operations. This transformation generates a lot of data, which opens up a whole new world of opportunities for the IT systems that handle it.

In this blog, I'm going to talk about the Internet of Things (IoT), which can help with digital transformation in industries like manufacturing, farming, heavy engineering, building automation, and logistics. IoT is a network of connected devices (called things) embedded with sensors and the software needed to exchange data over the internet. This blog is written in the context of applying IoT to building automation. I am using Google Cloud Platform (GCP) to implement the backbone. Equivalent services are also provided by AWS and Azure.

Fig. Implementation of IoT using GCP Services.

The figure above shows the implementation of an IoT system using GCP services. At first glance it may look complicated, but believe me, it is very straightforward, and after going through this blog you will be able to understand the function of, and the need for, each block shown here. What you see here is the exhaustive list of services that one can use to implement a full-fledged IoT system, but you can always tailor it to your needs.

Devices

These are essentially a combination of hardware sensors and software that periodically reads the sensors and sends the readings to the data ingestion pipeline. The sensors sample data at regular intervals, and the device sends this telemetry data via device-to-cloud communication. The parameter being sensed and the sensor being used depend entirely on the domain and use case. The software usually handles device management, data encryption, and the client libraries needed to connect to cloud services. In some cases it is also responsible for multiplexing, encoding, and data compression.
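
As a rough illustration of the device-side software, the sketch below takes one set of sensor readings, wraps it in a JSON message, and compresses it before upload. The `read_sensors` function, the field names, and the device ID `ahu-01` are hypothetical; a real device would use its own sensor drivers and an MQTT or HTTP client for the actual transfer.

```python
import json
import time
import zlib


def read_sensors():
    """Hypothetical stand-in for real sensor drivers."""
    return {"temperature_c": 24.5, "humidity_pct": 41.0}


def build_payload(device_id, readings, timestamp=None):
    """Encode one telemetry sample as compressed JSON bytes."""
    message = {
        "device_id": device_id,
        "ts": int(timestamp if timestamp is not None else time.time()),
        "readings": readings,
    }
    raw = json.dumps(message, sort_keys=True).encode("utf-8")
    return zlib.compress(raw)


def decode_payload(payload):
    """Inverse of build_payload, as the receiving side would apply it."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


if __name__ == "__main__":
    payload = build_payload("ahu-01", read_sensors())
    print(len(payload), "bytes ->", decode_payload(payload))
```

For very small messages, compression can make the payload larger rather than smaller, so it tends to pay off only when several samples are batched together.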

Cloud IoT Core

This is a service provided by GCP to ingest telemetry data into the cloud. It acts like a gateway. It contains a protocol bridge that lets on-site devices connect and communicate over protocols like MQTT and HTTP. A data broker module then passes these messages to Cloud Pub/Sub for further distribution. IoT Core also maintains a device registry and implements an authentication mechanism so that only designated devices can send data. It's a fully managed, scalable, highly available service that balances load automatically.
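
To make the device registry and protocol bridge more concrete, here is a small sketch of the identifiers a device would present when connecting over MQTT. The client ID and topic formats follow the naming scheme documented for the IoT Core MQTT bridge as I understand it; the helper function names and the default token lifetime are my own assumptions. The claims are left unsigned here, since signing them needs a third-party JWT library and the device's private key.

```python
import time

# Host and TLS port of the MQTT bridge, as documented for IoT Core.
MQTT_BRIDGE_HOST = "mqtt.googleapis.com"
MQTT_BRIDGE_PORT = 8883


def mqtt_client_id(project_id, region, registry_id, device_id):
    """Full device path that the bridge expects as the MQTT client ID."""
    return (
        f"projects/{project_id}/locations/{region}"
        f"/registries/{registry_id}/devices/{device_id}"
    )


def telemetry_topic(device_id):
    """Topic a device publishes telemetry events to."""
    return f"/devices/{device_id}/events"


def config_topic(device_id):
    """Topic a device subscribes to for configuration updates."""
    return f"/devices/{device_id}/config"


def jwt_claims(project_id, lifetime_s=3600, now=None):
    """Unsigned JWT claims; the audience is the GCP project ID."""
    issued = int(now if now is not None else time.time())
    return {"iat": issued, "exp": issued + lifetime_s, "aud": project_id}
```

The project, region, registry, and device names passed to these helpers are whatever you chose when creating the registry; nothing here talks to the network.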

Cloud Pub/Sub

Cloud Pub/Sub acts like a message queue or buffer in the pipeline. The messages received from Cloud IoT Core are distributed from here to the processing pipeline. It's a highly available, completely serverless, auto-scaling, low-latency service. It buffers data during outages, compensates for rate differences between producers and consumers, and ensures at-least-once delivery, which means a subscriber may occasionally see the same message more than once.
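
At-least-once delivery has a practical consequence for whatever consumes the messages: it should tolerate duplicates. The sketch below shows one common way to do that, by remembering message IDs that have already been processed. It is an in-memory illustration only; the class name is hypothetical, and a real subscriber would use the Pub/Sub client library and a bounded or external store for the seen IDs.

```python
class IdempotentHandler:
    """Processes each message ID once, even if it is delivered again."""

    def __init__(self, process):
        self._process = process
        self._seen = set()  # unbounded here; bound or externalize it in production

    def handle(self, message_id, data):
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._process(data)
        # Mark as seen only after processing succeeds, so a failure can be retried.
        self._seen.add(message_id)
        return True


if __name__ == "__main__":
    rows = []
    handler = IdempotentHandler(rows.append)
    for msg_id, data in [("1", "a"), ("2", "b"), ("1", "a")]:
        handler.handle(msg_id, data)
    print(rows)
```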

Cloud Dataflow

This GCP service lets us process the data. A single pipeline can process streaming as well as batch data. The service can be used for data transformation as well as for streaming analytics. It's an auto-scaling service and comes with a ready-made template to transfer data from Pub/Sub to BigQuery.
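
The kind of per-message transformation such a pipeline performs can be sketched in plain Python. The function below flattens a JSON telemetry message into a row suitable for a warehouse table; the message layout and column names are assumptions made for illustration. In an actual Dataflow job this logic would sit inside an Apache Beam transform, which is not shown here.

```python
import json
from datetime import datetime, timezone


def to_bigquery_row(message_bytes):
    """Flatten one JSON telemetry message into a row dictionary."""
    msg = json.loads(message_bytes.decode("utf-8"))
    readings = msg.get("readings", {})
    return {
        "device_id": msg["device_id"],
        "event_time": datetime.fromtimestamp(msg["ts"], tz=timezone.utc).isoformat(),
        "temperature_c": readings.get("temperature_c"),
        "humidity_pct": readings.get("humidity_pct"),
    }


def transform(messages):
    """Apply the same transform to a bounded batch or an unbounded stream."""
    for message in messages:
        try:
            yield to_bigquery_row(message)
        except (KeyError, ValueError, TypeError, AttributeError):
            # Skip malformed messages; a real pipeline would route them
            # to a dead-letter sink instead of dropping them silently.
            continue
```

Because `transform` only iterates over its input, the same code works whether the messages come from a finite file or from a generator that never ends, which mirrors the idea of one pipeline for both batch and streaming data.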

BigQuery

BigQuery can be used to store the telemetry data from all the devices, possibly across multiple sites. It's a petabyte-scale enterprise data warehouse with very low-cost storage and analytics capabilities on the stored data. It allows complex SQL queries for historical analytics. It's a secure, durable, and highly available service.
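
As an example of the historical analytics mentioned above, the sketch below builds a Standard SQL query that computes hourly average temperature per device over the last few days. The table name and the columns `device_id`, `event_time` (assumed to be a TIMESTAMP), and `temperature_c` are hypothetical. The function only returns the query text; running it would require the BigQuery client library or the console.

```python
def hourly_average_query(table, days=7):
    """Return a BigQuery Standard SQL query for hourly averages per device."""
    # `table` is inserted as-is, so it must come from trusted configuration.
    return f"""
SELECT
  device_id,
  TIMESTAMP_TRUNC(event_time, HOUR) AS hour_start,
  AVG(temperature_c) AS avg_temperature_c
FROM `{table}`
WHERE event_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {int(days)} DAY)
GROUP BY device_id, hour_start
ORDER BY device_id, hour_start
""".strip()


if __name__ == "__main__":
    print(hourly_average_query("my-project.building.telemetry", days=3))
```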

Cloud ML / Cloud AI Platform

Once the data is in BigQuery, apart from dashboards and historical analysis, it can also be used in various ML-based applications, e.g. predicting equipment maintenance needs, projecting future trends, and detecting anomalies. Cloud AI Platform is the service meant for developing and maintaining such AI/ML applications. It allows you to train models on data in BigQuery without transferring it elsewhere. It also supports the complete ML application life cycle and collaboration within a team. It's a fully managed, scalable, distributed service with features like automatic hyperparameter tuning and shared templates through AI Hub.
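
To give a feel for what anomaly detection on telemetry means, here is a deliberately simple statistical stand-in: it flags readings that lie more than a chosen number of standard deviations from the mean. This is not a trained model and not what Cloud AI Platform does internally; the function name and the default threshold are assumptions for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=3.0):
    """Return indices of readings more than `threshold` standard deviations from the mean."""
    if len(values) < 2:
        return []  # not enough data to estimate spread
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all readings identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


if __name__ == "__main__":
    temperatures = [20.0] * 20 + [80.0]
    print(find_anomalies(temperatures))
```

A real predictive-maintenance model would learn from labeled history and several signals at once, but the input and output shape (a series of readings in, suspicious points out) is similar.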

Update Device Config

A must-have feature for any IoT system is the ability to control devices remotely. This is usually done through cloud-to-device communication. There are two common reasons to control a device remotely. One is scheduled control, e.g. shutting the device down on weekends or running it at a lower frequency or capacity at night. The other is a response to preventive and predictive maintenance. In both cases, IoT Core sends the updated configuration to the devices, and the devices act accordingly. Cloud Functions can be used for scheduled device config updates, whereas dynamic device config updates are initiated from the data processing layer, either through Cloud Dataflow or Cloud AI Platform.
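
The scheduled-control case can be sketched as a function that picks a configuration from the current time. The config fields (`power`, `capacity_pct`) and the 22:00 to 06:00 night window are hypothetical choices for this example. In practice the returned bytes would be pushed to the device through the platform's config-update API from something like a scheduled Cloud Function, which is not shown.

```python
import json
from datetime import datetime


def scheduled_config(now):
    """Pick a device configuration for the given local time."""
    if now.weekday() >= 5:  # Saturday (5) or Sunday (6)
        config = {"power": "off"}
    elif now.hour >= 22 or now.hour < 6:  # assumed night window
        config = {"power": "on", "capacity_pct": 50}
    else:
        config = {"power": "on", "capacity_pct": 100}
    # Device configs are opaque bytes to the cloud, so serialize to JSON here.
    return json.dumps(config, sort_keys=True).encode("utf-8")


if __name__ == "__main__":
    print(scheduled_config(datetime.now()))
```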

Key Challenges

  1. Security: Security is a critical concern when deploying and managing IoT devices, and it needs to be handled at every stage of the data pipeline. IoT Core provides per-device authentication using a limited-duration JWT signed with a public/private key pair specific to each device. Data transfer between the various services is encrypted, and access is controlled through IAM. Data at rest is always encrypted on GCP. All in all, GCP has the means to secure your data; we just need to configure it the right way.
  2. Connectivity: If data is ingested into the cloud over a cellular network, it is very important to plan for network outages. Devices should be configured to send the data collected while offline once the network resumes. This backlog can be sent the regular way through IoT Core or batch-uploaded to Cloud Storage. The latter calls for a batch processing pipeline to eventually process the data and insert it into BigQuery. Dataflow lets you use the same pipeline for streaming as well as batch data.
  3. Dynamic Scaling: IoT applications usually start small, with few clients, sites, or devices, and then grow with popularity. To cater to this, it is important to design a system that requires no upfront investment, scales fully, and handles periodic or seasonal variations. Cloud IoT Core, Cloud Pub/Sub, Dataflow, and BigQuery are fully scalable services, so scaling can be handled easily with GCP.
  4. Actionable Insights: BigQuery provides historical analytics. Analytical and monitoring dashboards for business insights can be set up using Looker, Data Studio, or any third-party BI tool. The data can be analyzed further using Datalab, and ML-based analytics can be implemented using Cloud AI Platform. All in all, GCP provides the services needed to analyze data and draw actionable insights.
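
The connectivity challenge above, sending offline data once the network resumes, is often handled on the device with a store-and-forward buffer. The sketch below is one minimal way to do it, assuming a `send` callable that returns `True` on success and `False` when the network is down. The class name and the queue size limit are hypothetical.

```python
from collections import deque


class StoreAndForward:
    """Queues telemetry while offline and flushes it in order once the link is back."""

    def __init__(self, send, max_items=10000):
        self._send = send  # callable returning True on success, False on failure
        # With maxlen set, the oldest samples are dropped once the buffer is full.
        self._queue = deque(maxlen=max_items)

    def submit(self, sample):
        """Queue a new sample and try to flush; return how many were sent."""
        self._queue.append(sample)
        return self.flush()

    def flush(self):
        """Send queued samples oldest first; stop at the first failure."""
        sent = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break
            self._queue.popleft()
            sent += 1
        return sent

    def pending(self):
        """Number of samples still waiting to be sent."""
        return len(self._queue)
```

A sample is removed from the queue only after `send` reports success, so an outage in the middle of a flush leaves the remaining backlog intact for the next attempt.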

Conclusion

GCP offers a suite of services to help you implement an end-to-end IoT data pipeline comprising ingestion, processing, storage, and analysis. The good part is that the pipeline inherently has features like auto-scaling, auto-healing, auto-upgrade, load balancing, high availability, and pay-per-use pricing. Since you are using managed services, the time to implement, and hence the time to market, is reduced.

For further reading, please refer to https://cloud.google.com/architecture/iot-overview

#Developer #SolutionArchitect #NLP #ML #DataMining #IoT #Automation #GoogleCloud. Actively managing @GDG_Cloud_Mumbai