Why You Can’t Google The Internet of Things

“With this massive influx of data many companies will have no idea what hit them.” — Bill Briggs, CTO, Deloitte Consulting

The Internet of Things is an industry that, after several decades of trial and error, hit the top of Gartner’s hype cycle curve just this year. If you believe in this kind of forecasting, then the IoT is set to start a rapid dive on the negative slope of this curve before (hopefully) one day pulling out of the nosedive and into a smooth upward glide path.

If there is an actual IoT nosedive, in hype or reality, the autopsy will likely say it was caused by wireless. Today’s wireless IoT technologies spew too much data, can’t be queried in real time, burn too much energy, and, if left unchecked, will cause a data tsunami that will make the slope of Gartner’s hype curve seem quite real.

Just how serious is the risk of an IoT data tsunami?

  • By 2017 half of IT networks will go from having a network capacity surplus to being network constrained, and 10 percent of sites will be overwhelmed. Storage vendors, on the other hand, love the IoT status quo.
  • 86% of manufacturers in US and Europe already report major increases in shop-floor data collection over the past two years; 62% are not sure if they have been able to keep up with larger data volumes.
  • Working overtime to feed the tsunami, battery-powered endpoints lose battery life prematurely, resulting in higher maintenance and replacement costs, something that catches IoT newbies by surprise. Data overload also causes data centers to work harder and burn more electricity in the process.
  • Security and privacy weaknesses abound in IoT networks, and wireless connectivity in particular is a notorious hacker’s hangout. About 90 percent of all IT networks will have an IoT-based security breach within two years.

Put bluntly, if we keep doing what we are doing now, much of the IoT we are all anxiously awaiting is going to suck.

“We live in a real world where bandwidth is neither infinite nor free. There’s a lot of data being generated. We talk about 50 billion sensors by 2020. If you look today at all the sensors that are out there, they’re generating 2 exabytes of data. It’s too much data to send to the cloud. There’s not enough bandwidth, and it costs too much money.” — Todd Baker, OIX product management head @ Cisco

Today’s Wireless Is Killing the IoT

To pinpoint the origins of the data tsunami, go to the source: the wireless IoT endpoint. Wireless endpoints are essentially a computer chip, a radio, an antenna, one or more sensors, some memory, and a power supply. Endpoints can be very tiny or quite large and can be standalone or integrated with other devices, but all share these basic attributes. The purpose of an endpoint is simple: to sense its environment and report to the network as programmed.

The radio in an endpoint can employ one of a number of low-power wireless communications protocols, yet most protocols were created to be a sort of WiFi-lite. As a result, nearly all are oriented around pushing data to the network (remember Pointcast?) rather than having endpoints stand by for intelligent queries that pull data, a la Google Search.

“Not only has the demand for capacity on our wireless networks been accelerating significantly, but it’s been accelerating in a non-scalable way,” says Charles Golvin, analyst @ Forrester Research.

That none of today’s protocols seriously contemplated a Google-like future for the IoT is astonishing. Searching in real-time for objects in the Internet of Things using simple or complex queries — and thereby pushing the maximum amount of data cleansing and analysis to the edge of the network — didn’t make anyone’s priority list 10–20 years ago?

Unfortunately, as the data tsunami is upon us, technologies like Bluetooth, 6LoWPAN, ZigBee and others are utterly unfit for the task of the next phase of the IoT.

Where The Tsunami Is Happening

We can usually categorize wireless IoT technology deployments into three network models:

Commercial and Industrial Networks. These make up the majority of IoT connections in the world, yet a potpourri of wireless IoT technologies is deployed here, with most designed long before the realities of Hadoop, ARM-based silicon, smartphones, and other innovations that, combined, make extraordinary IoT forecasts more believable. Well-meaning technologies like ZigBee or 6LoWPAN hog bandwidth in dense environments, drain endpoint batteries too quickly, and don’t offer the real-time query capability at the edge of the network that the IoT needs. In many cases, these technologies are programmed to beacon their status every few milliseconds as a way to fake a real-time query capability, which wreaks havoc on the downstream network. To their credit, cellular carriers have made good strides in re-using their LTE networks for IoT applications, but LTE’s high cost and short battery life limit its potential to mains-powered applications like vending machines, so their role in the tsunami is relatively limited. Newer technologies like LPWANs (you might have heard of SigFox or LoRa) seem likely to prevail due to range that is often comparable to LTE while using less costly hardware and consuming power at a fraction of LTE’s rate. LPWANs at the moment suffer from a one-way beaconing syndrome where data can only be sent every few minutes, though help is on the way.

“IoT deployments will generate large quantities of data that need to be processed and analyzed in real time … leaving providers facing new security, capacity and analytics challenges.” — Fabrizio Biscotti, research director @ Gartner.

Private Personal Networks. Your FitBit uses Bluetooth beacons that send accelerometer data to your smartphone (or anything nearby) every few milliseconds. FitBit is a major success, but non-stop beaconing is a dumb way to send data to your smartphone or usually anything else. Far better to batch the data and transmit less frequently — ideally when an authorized user queries it — resulting in major battery life gains for your FitBit and your smartphone. And since Bluetooth beacons are unencrypted, batching data makes it harder for stalkers and NSA types to do their spy work. Oh, and this wearables market is projected to be at $53 billion in four years, which combined with oodles of other Bluetooth gadgets, is its own data tsunami.

Public Networks. Exciting but mostly in the pilot or planning phases, these include smart cities that make real-time parking data available to anyone who asks; public or semi-public lost-and-found networks for everything from stolen bicycles to lost Alzheimer’s patients; and roadways equipped with environmental and other sensors whose data is shared with vehicles to ease congestion, reduce fuel consumption, and improve safety. For public networks using licensed spectrum (e.g. cellular carriers), the data tsunami is capped by the practical limits of the technology. For public networks using unlicensed spectrum (like LPWANs, which I expect to prevail in public networks), a tragedy of the commons problem puts them in a tsunamic position similar to that of private commercial networks.

Real-Time Is A Must-Have for the IoT

Today’s internet is mostly real-time (for a good time waster, check this out) and the Internet of Things needs to be real-time, too. If you disagree with this statement you are either invested in non-real-time technologies or you still watch the CBS Evening News at 6 p.m. on Channel 2 with your black and white television. The importance of real-time data — with a real-time query capability that is transacted at the edge of the network — should be obvious to anyone familiar with the scalability and operating challenges of future IT networks. Just in terms of time, consider industrial, commercial, and public safety applications where minutes or seconds of latency can render data non-actionable and often worthless.

“We clearly see data and content being created at the edge of the network … this content won’t be sent over the network to be processed by the ‘enterprise-based’ cloud infrastructure. Rather, you will need cloud computing-like processing at the edge.”— Vernon Turner, SVP @ IDC

There are only two essential jobs of a low power wireless IoT technology:

  1. Immediately send a message when an important event occurs.
  2. Immediately and accurately answer queries, simple or complex, from an authorized user.

The first job is something that many IoT technologies don’t do that well. A few do it on an as-it-happens basis (e.g. IF the temperature in this storage facility falls below 40 degrees, THEN send an alert message to router). Those who can’t do this usually overcompensate by sending everything every few milliseconds so that nothing gets missed and someone else cleans up the data aftermath. Still others send a regular update every 10 minutes or 10 hours (!) since their network design or choice of wireless frequency prohibits anything more frequent.
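
The difference between the first two strategies can be sketched in a few lines of Python. This is a minimal illustration, not any vendor’s firmware: the function names, the sample readings, and the use of the 40-degree threshold from the example above are all assumptions made for the sketch.

```python
from typing import Iterable, List, Tuple

THRESHOLD_F = 40.0  # assumed alert threshold, borrowed from the storage-facility example


def beacon_everything(readings: Iterable[float]) -> List[Tuple[int, float]]:
    """Naive strategy: transmit every sample and let the network sort it out."""
    return [(i, r) for i, r in enumerate(readings)]


def alert_on_event(readings: Iterable[float],
                   threshold: float = THRESHOLD_F) -> List[Tuple[int, float]]:
    """Event-driven strategy: transmit only when a reading newly falls below the threshold."""
    messages = []
    below = False  # was the previous sample already below the threshold?
    for i, r in enumerate(readings):
        if r < threshold and not below:
            messages.append((i, r))  # IF temperature falls below threshold THEN alert
        below = r < threshold
    return messages


# Hypothetical temperature samples from a single sensor
samples = [45.0, 44.1, 42.3, 39.8, 38.5, 41.0, 43.2, 39.9, 44.0]

print(len(beacon_everything(samples)))  # one message per sample
print(alert_on_event(samples))          # only the two downward crossings
```

With these made-up samples, the beaconing endpoint transmits nine times while the event-driven endpoint transmits twice, once for each time the temperature drops below the threshold.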

The second job is one that nearly every one of today’s technologies fails to do. Here’s an example of a simple real-time query request:

Query: “I need the location and pressure statistics for the past five hours on any fire extinguisher in the metropolitan area that was touched by XYZ Corp this month.”

But the response to this request using today’s IoT connectivity options would go something like:

Result: “Will do, sir, but we have 50,000 endpoints in the area so we should be able to get the data to you by Friday around lunchtime.”
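
For contrast, here is a rough Python sketch of how the same request might be handled if each endpoint kept its own history and evaluated the query locally. The record layout, the field names, and the `answer_query` function are hypothetical, invented for illustration. The point is only that endpoints that do not match stay silent, so the volume of replies tracks the number of matches rather than the 50,000 endpoints in the area.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Extinguisher:
    """Hypothetical endpoint record; all field names are assumptions."""
    endpoint_id: str
    location: Tuple[float, float]   # (latitude, longitude)
    touched_by: List[str]           # organizations that handled it this month
    pressure_log: List[Tuple[float, float]] = field(default_factory=list)  # (hours_ago, psi)


def answer_query(ep: Extinguisher, company: str, window_hours: float) -> Optional[dict]:
    """Runs on the endpoint: reply only if this endpoint matches the query."""
    if company not in ep.touched_by:
        return None  # no match, so stay silent (no radio traffic)
    recent = [psi for hours_ago, psi in ep.pressure_log if hours_ago <= window_hours]
    if not recent:
        return None  # nothing logged inside the requested window
    return {
        "id": ep.endpoint_id,
        "location": ep.location,
        "min_psi": min(recent),
        "max_psi": max(recent),
    }


# Made-up fleet of three endpoints
fleet = [
    Extinguisher("fx-001", (37.77, -122.42), ["XYZ Corp"],
                 [(1.0, 195.0), (4.0, 190.0), (9.0, 185.0)]),
    Extinguisher("fx-002", (37.80, -122.27), ["Acme"], [(2.0, 200.0)]),
    Extinguisher("fx-003", (37.33, -121.89), ["XYZ Corp", "Acme"], [(7.0, 180.0)]),
]

# Broadcast the query; collect only the endpoints that chose to answer
replies = [r for ep in fleet if (r := answer_query(ep, "XYZ Corp", 5.0)) is not None]
print(replies)
```

Here only `fx-001` replies: `fx-002` was never touched by XYZ Corp, and `fx-003` has no readings within the five-hour window.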

“IoT threatens to generate massive amounts of input data from sources that are globally distributed. Transferring the entirety of that data to a single location for processing will not be technically and economically viable.” — Joe Skorupa, VP Distinguished Analyst @ Gartner

The IoT Needs A Google

The data tsunami is a complex business and technical problem, and Occam’s Razor says to use the simplest approach to solve complex problems. The simplest approach to solving the IoT data tsunami is to fundamentally rethink the role of IoT endpoints. Today’s endpoints can often be (simply) abstracted into something like this:

This approach not only fails to address today’s data tsunami, it also leaves almost no room for growth for future developers who want to exploit the data on an endpoint in any number of creative and innovative ways no one has yet contemplated.

Google is the most popular way of searching the world wide web due to its simplicity, speed, and effectiveness. It’s time to think of IoT endpoints more like web servers with an integrated database/filesystem that can be queried on demand or set to generate alerts. In other words, the Google model.

Like a web server, the ideal default for an IoT endpoint is to remain silent until an authorized user queries it with a specific question or command. Keeping IoT devices in quiet mode forces them to batch data, and “speaking only when spoken to” solves many problems simultaneously:

  1. Enables higher quality, real-time data queries
  2. Pushes computing and analysis to the extreme edge of the network, reducing network congestion and latency
  3. Improves network capacity, reducing hardware, storage, power, and labor costs
  4. Reduces power consumption at the endpoint, router, and data center
  5. Improves privacy and security

But what if a sensor is tripped without a query? This is important as well and IoT endpoints need the flexibility to transmit to the network when a specific event occurs, like, say, a sensor threshold is breached or a motion sensor detects a cat walking by.
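
To make the “silent until queried, but still able to raise an alert” behavior concrete, here is a minimal Python sketch. It is not DASH7 or any real protocol stack: the `QuietEndpoint` class, its simple token check, and the callback used for alerts are assumptions chosen to keep the example short, and a real deployment would need proper authentication and encryption.

```python
from typing import Callable, List, Optional, Set


class QuietEndpoint:
    """Hypothetical endpoint that stores readings locally and transmits only
    (a) in response to an authorized query or (b) when a threshold is breached."""

    def __init__(self, authorized_tokens: Set[str], alert_above: float,
                 send_alert: Callable[[float], None]) -> None:
        self._tokens = authorized_tokens
        self._alert_above = alert_above
        self._send_alert = send_alert
        self._log: List[float] = []  # batched readings kept on the device

    def record(self, value: float) -> None:
        """Called on every sensor sample; normally causes no transmission."""
        self._log.append(value)
        if value > self._alert_above:
            self._send_alert(value)  # unsolicited message, only on a breach

    def query(self, token: str, last_n: int) -> Optional[List[float]]:
        """Return the most recent readings to an authorized caller, else nothing."""
        if token not in self._tokens:
            return None  # unauthorized callers get no reply
        if last_n <= 0:
            return []
        return self._log[-last_n:]


alerts: List[float] = []  # stands in for the radio link to the network
ep = QuietEndpoint({"secret-token"}, alert_above=100.0, send_alert=alerts.append)

for v in [20.0, 21.5, 120.0, 22.0]:
    ep.record(v)

print(alerts)                        # only the breach was transmitted unprompted
print(ep.query("secret-token", 2))   # authorized query pulls the batched data
print(ep.query("wrong-token", 2))    # unauthorized query gets nothing
```

Four samples were recorded, but the endpoint spoke unprompted only once, for the reading that exceeded the assumed threshold. Everything else stayed on the device until an authorized caller asked for it.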

The IoT endpoint architecture of the future would ideally be (again simply) abstracted like this:

Note that many IoT vendors will continue to advocate for the current “dumb endpoint” approach, claiming that a router or gateway that connects to the endpoint can do the parsing, filtering, and storing of data — a sort of “near-edge” computing paradigm. But their argument is rooted in the inherent weaknesses of incumbent wireless technologies and does nothing to improve wireless radio spectrum capacity, endpoint battery life, endpoint privacy and security, or storage costs. It’s a little like saying that Google should be indexing content delivery networks like Akamai instead of individual websites themselves.

How To Add A Google To the IoT

Querying an IoT network Google-style may seem obvious or easy, but it’s also possible that it wasn’t built (until now) because it is hard to do. At a company I co-founded, Haystack, we designed an endpoint device filesystem and low latency, query-based device architecture to enable exactly this kind of real time data retrieval. The technology, called DASH7, does this (and more) uniquely in the marketplace:

Haystack is still a young company, but it is getting good reviews, and we submit that our approach is a step in the right direction to help avert the IoT data tsunami.

Beyond Real-Time Networking

The practical benefits of a real-time IoT at the network’s edge mentioned so far are really just the start, and the ability to query endpoints directly (with or without the cloud) gives rise to entirely new opportunities. For endpoints in public networks, we will see public APIs and paywalls for individual IoT endpoints. Advertising or e-commerce in association with specific endpoints. Endpoints that behave in unique ways when queried by or in the physical presence of another unique endpoint or person. For private networks, business intelligence applications can now bypass the latency of today’s wireless IoT technologies and see faster (in a few seconds, in most cases) and more meaningful reporting. Just as some of today’s biggest web-based phenomena like YouTube or Facebook would likely not exist had Google not perfected the indexing of the web, some of the biggest innovations in the IoT await the ability to spider and search wireless endpoints.

Note: you can reach me at pat at haystacktechnologies dot com