IoT on Google Cloud at scale

Nicholas Ord
Google Cloud - Community
4 min read · Nov 15, 2018

How to verify IoT costs before major investment

1. Hardware for $10?

2. Secure TLS 1.2 or 1.3?

3. Unix timestamps & PTP atomic clocks?

4. Cloud OPEX under 10 cents/month?

Iteration 1

What is the absolute minimum cloud and hardware needed to get clean data from industrial installations?

An ARM Cortex-M3 is feasible (see pic), but we had to wipe the flash (via the solder jumper on the reverse side) and rewrite the code in C and assembler. TLS 1.2 works; we contacted ARM Holdings about getting TLS 1.3 into the Cortex-M3 portfolio.

There is no room left for edge analytics, but 3 dumb sensors work fine. Legacy WiFi chips have no SNI, and connecting to hot spots in the field with timeouts is useless. For 1 million devices, total cloud OPEX is 5 cents/month per device.
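As a quick sanity check, the fleet-wide bill follows from simple multiplication. This is a minimal sketch that assumes cost scales linearly with device count; the function name is illustrative and not taken from the project.

```python
def fleet_opex_dollars(devices: int, cents_per_device_month: int) -> float:
    """Total monthly cloud OPEX in dollars, assuming strictly linear scaling."""
    return devices * cents_per_device_month / 100


# Figures quoted in the text: 1 million devices at 5 cents/month each.
total = fleet_opex_dollars(1_000_000, 5)
print(f"${total:,.0f} per month for the whole fleet")
```

At that scale, even a one-cent change in per-device cost moves the monthly bill by $10,000, which is why it pays to verify the number before investing.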

First iteration: ARM M3 devices. Second iteration: ARM M4 with sensors and R Pi Zero.

Iteration 2

Build a data science lab for industrial IoT

Most sensors on top side of blue board, about the size of a matchbox

12 sensors with on-board filters (FFT, FIR, LMS) can measure 3-phase electricity at Nyquist sampling rates. 8 billion rows of data (n1-highmem-96), each data point hashed.
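The text does not say how each data point was hashed. One plausible approach, sketched here purely as an assumption, is a SHA-256 digest over a canonical JSON form of the reading. The field names (`device_id`, `sensor`, `ts`, `value`) are hypothetical.

```python
import hashlib
import json


def hash_data_point(point: dict) -> str:
    """Return a SHA-256 hex digest of a reading.

    Keys are sorted and whitespace removed so the same reading always
    serialises to the same bytes, regardless of key order.
    """
    canonical = json.dumps(point, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical reading; the real schema is not given in the article.
reading = {"device_id": "board-0001", "sensor": "accel_x", "ts": 1542240000, "value": 0.0123}
digest = hash_data_point(reading)
print(digest)
```

A per-row digest like this makes it cheap to detect duplicates or tampering later without comparing every field.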

On the lab bench: ST Micro, Panasonic, Silicon Labs, Bosch, AMS, Microchip then shrunk to one board stacked onto a R Pi Zero.

OPEX cloud costs

Costs

$570/month with BigQuery. After the data science phase, BigQuery is scaled back to lab use only, and production costs $410 a month. Costs prove to be linear or sub-linear (2 VMs handle 2000 boards nicely). Markov models and k-means can later run directly in the pipe (no need to store); just extract the data value at source.
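To illustrate what running k-means "in the pipe" can look like, here is a minimal sketch of sequential (online) k-means, which updates centroids one message at a time and never stores the raw stream. It is a generic textbook variant, not the project's actual pipeline code, and the class name is hypothetical.

```python
from typing import List, Sequence


class OnlineKMeans:
    """Sequential k-means: each point nudges its nearest centroid."""

    def __init__(self, initial_centroids: Sequence[Sequence[float]]) -> None:
        self.centroids: List[List[float]] = [list(c) for c in initial_centroids]
        self.counts: List[int] = [0] * len(self.centroids)

    def update(self, x: Sequence[float]) -> int:
        """Assign x to its nearest centroid, move that centroid, return its index."""
        idx = min(
            range(len(self.centroids)),
            key=lambda i: sum((c - v) ** 2 for c, v in zip(self.centroids[i], x)),
        )
        self.counts[idx] += 1
        rate = 1.0 / self.counts[idx]  # running-mean step size
        self.centroids[idx] = [c + rate * (v - c) for c, v in zip(self.centroids[idx], x)]
        return idx


# Toy one-dimensional stream with two obvious clusters.
model = OnlineKMeans([[0.0], [10.0]])
for value in [1.0, 9.0, 3.0, 11.0]:
    model.update([value])
print(model.centroids)
```

Memory use depends only on the number of centroids, so the same code handles a handful of messages or billions.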

Data scientist: “it’s like a digital combine harvester!”

Latency is crucial for data science: each message gets 3 timestamps so it can be correlated with other sensors or device clusters in different parts of the world at the same moment.
830 million rows of accelerometer data points were collected within a few weeks from sites in the USA, EU and Asia.
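The article does not specify which three timestamps each message carries. The sketch below assumes one set on the device at sampling time, one when the cloud ingests the message, and one when the pipeline processes it. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StampedMessage:
    device_id: str
    sampled_at: float    # Unix seconds, set on the device (assumed)
    ingested_at: float   # Unix seconds, set on cloud ingest (assumed)
    processed_at: float  # Unix seconds, set by the pipeline (assumed)

    @property
    def transport_latency(self) -> float:
        """Seconds between sampling and arrival in the cloud."""
        return self.ingested_at - self.sampled_at

    @property
    def pipeline_latency(self) -> float:
        """Seconds between arrival and processing."""
        return self.processed_at - self.ingested_at


def same_moment(a: StampedMessage, b: StampedMessage, tolerance_s: float = 0.001) -> bool:
    """True if two readings were sampled within tolerance_s of each other."""
    return abs(a.sampled_at - b.sampled_at) <= tolerance_s


us = StampedMessage("us-board-1", 1542240000.0, 1542240000.25, 1542240000.5)
eu = StampedMessage("eu-board-7", 1542240000.0005, 1542240000.3, 1542240000.6)
print(us.transport_latency, us.pipeline_latency, same_moment(us, eu))
```

Comparing on the device-side timestamp only makes sense when device clocks are disciplined, for example by NTP or PTP as described in the text.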

Data scientists get massive computing power on demand at low cost and never need to download data. Devices stream from locations around the world with low latency, synced to NTP or PTP.

1 day’s data of IoT under stress testing

Iteration 3

Remove the R Pi Zero, integrate a micro server onto the SoC, and harden the system. Scale to 100 million data points per day. Add a hardware security accelerator.

Lessons learnt in second iteration:

1. Data science on 9 billion rows of data every month needs serious compute power to analyse, even with high-memory GPU VMs. Harvest machine learning labels for later transfer to Edge TPUs.

2. Try provisioning like this for keys and this for tokens. Data redaction and tokenisation can be planned upfront.

3. Verify each business case at >$5 profit per unit in the first year. Examples: predictive maintenance for factories, real-time insurance adjustments for industrial sites.

4. Certified industrial modules are $110 in production and allow retrofit of factories and other installations in under five minutes per unit. They can be adapted quickly to accept interfaces from OEM products with little change to the cloud (cloud “publishes” whatever it gets; you could use Kafka / Lambda on AWS but it is a lot more complicated to configure and secure).

5. Once machine learning migrates to the edge, data rates drop to 10kbps up-link. More details of field testing unit here.
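To put the 10 kbps up-link figure from point 5 in perspective, the daily volume per device can be estimated as follows. This is a rough sketch that assumes decimal kilobits (1 kbps = 1000 bits/s) and a link that transmits continuously at the full rate, so it gives an upper bound.

```python
SECONDS_PER_DAY = 86_400


def daily_uplink_megabytes(kbps: float) -> float:
    """Upper-bound daily volume in MB for a link saturated at the given kbps."""
    bits_per_day = kbps * 1000 * SECONDS_PER_DAY
    return bits_per_day / 8 / 1_000_000


per_device_mb = daily_uplink_megabytes(10)
print(f"{per_device_mb:.0f} MB per device per day at most")
```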

I can recommend the lab approach: pharmaceutical companies work with raw ingredients from jungles for years without first knowing exactly what they will be able to synthesise into medical products.

It is no different for data, yet the industry still expects miracles from bad data with business cases upfront. Even 3-star chefs with wonderful kitchens (platforms) cannot cook if the raw ingredients are rubbish.

“It seems the hardware engineers at industry levels don’t include the future customers — the data scientists” Senior Data Scientist.

Tried this on Azure / AWS: it failed on cost (see also HFT guy views), latency, ease of implementation, lack of machine learning, and security. Google does data as its core business with 3 billion Android phones in real time. Maybe that’s why?
