Teaching a new dog old tricks

What the history of the Internet can teach the Internet of Things

Talking at The Things Network Conference in Amsterdam, Feb 2018. (Image credit: The Things Network)

Transcript of the talk I gave at The Things Network Conference in February 2018.

As a group, developers tend to think far more about the future than the past. Well, unless, that is, we’re obsessing about retro video games. But every once in a while it’s worth taking a step back and looking at history, and then deciding whether we want to repeat our mistakes, and our triumphs, just one more time, or whether we should be doing something different this time around.

Because we’ve been here before. While the Internet of Things is very much in its infancy, the other Internet, the digital one, has at least started to outgrow its adolescence, even if it perhaps hasn’t yet reached its middle years.

The IMP log with the very first message sent on the Internet. (Image credit: Andrew Adams)

The first message ever sent over the ARPANET—the network that’s widely regarded as the most direct predecessor of the modern Internet—was sent by student programmer Charley Kline from a computer at UCLA at 10:30 pm on October 29th, 1969, when he attempted to log in to a second machine, located at the Stanford Research Institute in Menlo Park, California. He managed to get as far as the “o” of “login” before the system crashed. The first successful connection was made about an hour later.

The two machines were connected by a pair of Interface Message Processors, or IMPs. The IMP was the first generation of gateway, known today as a router. However, despite being almost synonymous with the Internet as we know it today, the connection between the two machines wasn’t a TCP connection; the TCP standard didn’t yet exist.

The birth of the other Internet — the digital one — was tied up with a vicious standards war. Something that will sound horribly familiar to those of us now involved with the Internet of Things.

However the fallout from this standards war—fought over the course of twenty years following the first message between those two original computers on the ARPANET—fundamentally underpins how we use the Internet today, and what’s surprising is that things didn’t work out how everyone expected. The rebel alliance won.

Looking back, the history of the Internet could have been very different. In the mid-eighties the proposed OSI standards were the obvious choice.

In fact, in 1988 the U.S. Department of Commerce issued a mandate that all computers purchased by US government agencies should be OSI-compatible starting from the middle of 1990. Yet two years later, when that date arrived, the battle was already over, and the OSI standards had lost.

The OSI model survived as an abstraction, and as a tool to torture undergraduates, but the protocol specifications designed according to the model didn’t.

In fact, by the early nineties the dominance of TCP/IP was almost complete. For instance, in January 1991 the British academic backbone network, JANET, which at the time was based around the X.25 Coloured Book protocols, established a pilot project to host IP traffic on the network on top of the existing X.25 traffic.

Within ten months the IP traffic had exceeded the levels of X.25 traffic. IP support became official in November that year, and a year or two later X.25 traffic over the JANET backbone had disappeared entirely. It had become just another part of the Internet.

Although there were still alternatives. FidoNet was a packet-based store-and-forward network for email and files that paralleled the growth of BITNET, another of those networks that eventually merged into today’s Internet. However, unlike the networks that would eventually make up the Internet, connections between nodes on FidoNet weren’t permanent.

Built around dial-up modem connections between bulletin boards, the first two FidoNet systems came online around Christmas of 1983. By its peak in 1996, FidoNet connected over 39,000 systems serving just under 2 million users.

However, throughout its lifetime FidoNet was beset with management problems and infighting. Much of this can be traced to the fact that transferring data over expensive long-distance phone lines cost the participants real money. Something to bear in mind when thinking about cellular, or satellite, backhaul for LoRa.

Because unlike the Internet, FidoNet was an entirely volunteer, community-based network. While there was in theory a hub-and-spoke architecture, in practice the entire thing operated in a rather ad-hoc manner, with some board operators making direct connections, peer-to-peer, outside their allocated zones. Which also has some interesting parallels with private-private peering for LoRa networks.

The FidoNet system was adapted to an environment in which local telephone service was inexpensive and long-distance calls costly. So when that went away, so did FidoNet. From its peak in the mid-nineties the node list shrank rapidly, although even today there are still around 2,500 nodes in areas where Internet access is difficult to come by. Inevitably, of course, there are now gateways between the remaining FidoNet Echomail services and the Internet.

Motherboard and VICE are building a community network in Brooklyn. (Image credit: Lara Heintz)

The Internet didn’t entirely kill community-based networks. They haven’t gone away, and, perhaps worryingly in a way, in recent years their number has started to increase once again.

Although the places where they appear are, at least for now, usually somewhat out of the way. Media smugglers get Game of Thrones, and the New York Times, to Cubans every week through an illegal network of runners, effectively implementing a real-life sneakernet. No word on whether they’ve also implemented IP over Avian Carriers.

Slightly closer to home, many rural communities have gotten tired of slow broadband and started to lay their own fibre, or deploy microwave links. In parts of rural America the barbed-wire fences separating fields are being put into service to carry broadband signals. Some activists are even laying the groundwork for an open source DIY cellphone network.

A lot of these community efforts are being driven by the net neutrality debate, and for me it’s interesting to see that—in the face of a fundamental threat to the way the Internet works—its open source nature means that traffic prioritisation can be patched by community action rather than by further standardisation or, worse yet, restrictive legislation. Albeit with an enormous cost in time, energy, and attention for everyone involved.

Other community networks exist in parallel with, or sometimes now on top of, the Internet, and have come into existence due to other perceived threats.

For the last 30 years we’ve seen an increasingly aggressive erosion of our privacy online, and the perceived need for anonymity and privacy on the Internet has created community layers on top of it, like Tor. You have to wonder whether similar network layers will start to appear on top of LoRa.

However, a lot of the problems we’re seeing with the Internet today are down to how the Internet was built, and to the arrival of a new application, a new service, that broke some of the expectations of the people who wrote the standards and protocols it was built upon.

Well, hardly new; this new application arrived 25 years ago. It has now been 25 years since the birth of the web, which changed the Internet forever…

I still rather vividly remember standing in a draughty computing lab, with half a dozen other people, crowded around a Sun SPARCstation, looking over the shoulder of someone who had just downloaded the first public build of NCSA Mosaic via some torturous method or another. I also remember shaking my head and saying “It’ll never catch on, why would you want images?”

But the arrival of the web broke the Internet that sits underneath it. Because there really is only one business model on the web, and that’s advertising. People have consistently refused to subscribe to services or pay for content.

Instead, advertising supports the services that sit underneath almost everything we do on the Web and behind advertising is the data that makes it possible. Think about how your day-to-day experience of the Web would be different if Google charged a monthly subscription fee for its search service, or used a micro-payment based approach to charge on a search-by-search basis.

A series of almost accidental decisions and circumstances have led to a world where most things on the web appear to be free. That doesn’t mean they are free, just that we pay for them in other ways. Our data and our attention are the currency we use to pay Google for our searches, and Facebook for keeping us in touch with our friends.

Whether you view that as a problem, as the folks that created the Tor network do, is a personal choice. But it was perhaps not unanticipated. With no idea of the web, which was still years in the future, the people building the Internet did think about the possibility.

Alan Kay, writing in 1972, anticipated the black rectangle of glass and brushed aluminium that lives in all of our pockets today, and the ubiquity of the ad-blocking software we need to make the mobile web even a little bit useable.

Marshall Rose, who was the chair of several IETF working groups at the time the Internet standards battles were being fought, had this to say:

“Twenty five years ago a much smaller crowd was fighting about open versus proprietary, and Internet versus OSI. In the end, ‘rough consensus and running code’ decided the matter: open won and Internet won,” — Marshall Rose

The TCP standard won not because it was better; that’s arguable. It won because of ‘rough consensus and running code’, which was far more important than getting everything right, right then. Today that is even more important than it was then. After all, the early battles around transport protocols aren’t the only standards battles we’ve seen where the rebels have won.

By the mid-noughties SOAP and XML were seen as the obvious way to build out the distributed services we all, at that point, already saw coming. Yet by the end of the decade SOAP and XML were in heavy retreat. JSON was free of XML’s fondness for design-by-committee, and it looked more familiar to programmers. RESTful services and JSON, far more lightweight and developer-friendly, had won.
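
To make that contrast concrete, here’s a minimal sketch, in Python, of the same hypothetical ‘current temperature’ request made both ways. The endpoints, payload shapes, and service names are all invented for illustration, not taken from any real API.

```python
import json
import urllib.request

# The SOAP way: a single question wrapped in an XML envelope, with
# namespaces and a SOAPAction header. Endpoint and schema are invented.
soap_body = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetTemperature xmlns="http://example.com/weather">
      <City>Amsterdam</City>
    </GetTemperature>
  </soap:Body>
</soap:Envelope>"""

soap_request = urllib.request.Request(
    "http://example.com/soap",  # invented endpoint; we'd POST this envelope
    data=soap_body.encode("utf-8"),
    headers={"Content-Type": "text/xml",
             "SOAPAction": "http://example.com/GetTemperature"},
)

# The RESTful way: the resource is just a URL, and the JSON response is
# a structure almost every language can parse natively.
with urllib.request.urlopen("http://example.com/weather/amsterdam") as response:
    reading = json.load(response)  # e.g. {"city": "Amsterdam", "temp_c": 4.2}
    print(reading["temp_c"])
```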

Despite that, depending on which standards body you want to listen to, ECMA or the IETF, JSON only became a standard in 2013, or 2014.

“JSON appeared at a time when developers felt drowned by misguided overcomplicated XML-based web services, and JSON let them just get the job done…” — Simon St. Laurent

However, it’s actually rather unlikely that many people have read those standards, and that includes the developers using them, and even those implementing the libraries developers depend on.

We have reached the point where standardisation bodies no longer necessarily create standards, to an extent they formalise them, and the way we build the Internet of Things is being fundamentally influenced by that new reality.

The open sourcing of The Things Network’s network server is a crucial stage in the development of the LoRa network. The thing that made the Internet the Internet, after all, is the peering agreements between the component networks. The thing that made TCP dominant wasn’t the standard itself. It was the community that formed around it: the developers who found it easier to use than the competing standards, and so produced running code; the developers who then created content and services that sat on top of that code; and the users who found everything so much more useable.

Rough consensus, running code.

However, the thing we have to remember as we’re building the Internet of Things, as we’re putting together the networks that underlie it, is that right now we’re building the Internet before the web. Before NCSA Mosaic gave us images.

The Things Network is only a couple of years old. What happens to it, and to LoRa, when you’re one of the half a dozen people crowded around a MacBook in a warm co-working space, espresso in hand (how things have changed), saying “…it’ll never catch on, who needs that?”

While Alan Kay’s prediction of the smartphone was almost prophetic, the IETF was in a way naive. It was a simpler time, without the ubiquitous panopticon of the modern world, and without the security threats that arguably shape the modern Internet, and our view of it.

As we saw in the browser wars, implementation now tends to lead standardisation. The standards we arrive at for the Internet of Things will likely come from developers deciding that what they’re doing is good enough for now, and that they should keep doing it that way until people make up their minds about what we all really should be doing.

Consensus around standardisation is now being built outside of the standards bodies, not inside them, and while the issues we deal with on the other Internet are well known (privacy, security, and neutrality), the issues around the Internet of Things are different.

One of the biggest problems with the Internet of Things right now is around the concept of ownership. As customers we may have purchased a thing, but the software and services that make the thing smart seem to remain in the hands of the manufacturer.

Last year, for instance, John Deere told farmers that they don’t really own their tractors, merely licences for the software that makes them go. That means that not only can they not fix their own farm equipment, they can’t even take it to an independent repair shop. That changes the very idea of what it means to own something.

For instance, recently—as Hurricane Irma bore down on Florida—Tesla, the electric car company, issued an over-the-air software fix for their cheaper car models in Florida that temporarily gave their drivers an extra 30 to 40 miles of range.

The cheaper models had been software locked to use only 80 percent of the available battery power. The remaining battery capacity would only normally have been unlocked by paying extra.

While security is always a concern, the sorts of security threats we’re facing now could be very different from those we’ve faced in the past. For the most part, security on the digital Internet is about confidentiality; for years the security industry has been all about trying to prevent data theft—not always successfully.

A few months ago we learned that, at the tail end of last year, sensitive data concerning the F-35 Joint Strike Fighter and JDAM smart bombs was stolen from an Australian defence contractor. Access was initially gained by exploiting a 12-month-old vulnerability in the company’s IT helpdesk portal.

But they needn’t have bothered because, as it later turned out, at least one of the contractor’s Internet-facing servers had a default administrator password, and a guest account.

We haven’t yet seen a widespread data breach for the Internet of Things; for the most part the problems with the data that is there are different. The problem is that the sheer size of the data sets makes aggregating the data somewhat dangerous.

NYC Metro Area Taxi Dropoffs, 1.3 Billion points plotted. (Image credit: Ravi Shekhar)

For instance, take the dataset obtained using a freedom of information request from the New York City Taxi and Limousine Commission. This dataset contains every taxi trip in New York from January 2009 through to June 2017.

The data includes pick-up and drop-off times, locations, trip distance, the itemised fare, passenger counts, and even how the passenger paid. People have built some amazing visualisations from the data since it was released. But there’s a big problem: the personally identifiable information—the driver’s licence number and taxi number—wasn’t anonymised properly, and you can easily tell which driver drove which trip.
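
The widely reported weakness was that the licence and medallion numbers had been ‘anonymised’ by hashing them with unsalted MD5. Because the space of valid medallion numbers is tiny, every possible hash can simply be precomputed and the mapping reversed. A minimal sketch of the idea, assuming a simplified, hypothetical medallion format rather than the real patterns:

```python
import hashlib
from itertools import product
from string import ascii_uppercase, digits

# Precompute the MD5 hash of every possible medallion number. The real
# medallions follow a handful of short letter/digit patterns; we assume a
# single simplified pattern (two letters, three digits) for illustration.
table = {}
for letters in product(ascii_uppercase, repeat=2):
    for numbers in product(digits, repeat=3):
        medallion = "".join(letters) + "".join(numbers)
        table[hashlib.md5(medallion.encode()).hexdigest()] = medallion

# 26^2 * 10^3 = 676,000 candidates: seconds of work on a laptop. Any
# "anonymised" hash in the released data now reverses with one lookup.
leaked_hash = hashlib.md5(b"AB123").hexdigest()  # as it would appear in the data
print(table[leaked_hash])                        # -> AB123
```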

Worse yet, at least from most people’s point of view, passenger identities can be de-anonymised through GPS harvesting. Multiple pick-ups at a commonly frequented location — a public building, a bar, a club perhaps — with drop-offs clustering at another location, for instance a home address, let you find regular visitors and identify them as patrons, or customers, of certain establishments.
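
A sketch of what that clustering might look like, assuming trips reduced to raw pick-up and drop-off coordinates; the bucketing precision and visit threshold here are invented for illustration:

```python
from collections import Counter

# Each trip reduced to (pickup_lat, pickup_lon, dropoff_lat, dropoff_lon).
# Rounding to three decimal places buckets coordinates into cells of
# roughly a hundred metres, enough to group repeat visits together.
def bucket(lat, lon, precision=3):
    return (round(lat, precision), round(lon, precision))

def frequent_dropoffs(trips, venue_lat, venue_lon, min_visits=5):
    """Count where passengers picked up outside a known venue were dropped
    off. Clusters of repeated drop-offs are, plausibly, the home addresses
    of the venue's regulars."""
    venue = bucket(venue_lat, venue_lon)
    dropoffs = Counter(
        bucket(d_lat, d_lon)
        for p_lat, p_lon, d_lat, d_lon in trips
        if bucket(p_lat, p_lon) == venue
    )
    return [location for location, count in dropoffs.items() if count >= min_visits]
```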

Soon (perhaps already?) data like this will be streamed in real time, rather than requiring a freedom of information request. Data will come from things like weather stations, electricity monitors, smart thermostats, or your Fitbit.

Strava data for London Heathrow Airport (Image credit: Strava)

For instance, Strava’s recent release of their data as a heat map wasn’t totally anonymised. By manipulating the Strava API it’s possible to de-anonymise the company’s latest data release and show exactly who was exercising inside the walls of some of the world’s most top-secret facilities.

Once someone makes a data request for a specific geographic location — a nuclear weapons facility, for example — it’s possible to view the names, running speeds, running routes and heart rates of anyone who shared their fitness data within that area.

The General Data Protection Regulation (GDPR), which will come into force throughout Europe, and yes, even in the UK — despite the horrors of Brexit — could well help to alleviate these sorts of data de-aggregation problems.

It should have a significant impact on the design of smart devices, and the business models behind them, and hopefully encourage the industry to make different and more ethical design choices.

Because today our privacy, or lack of it, is inherent in the design of our smart devices. We might even need to consider the idea of GDPR-wrapped software, instantiating the law in a way that means that security problems, data breaches, are less serious.

Because so far the wide scale security compromises we’ve seen with the Internet of Things have been unsubtle.

Mirai, for instance, was a piece of malware that identified vulnerable IoT devices using a list of just 60 (or so) common factory-default usernames and passwords, logged into them, and took them over to form part of a botnet.

Infected devices continued to function perfectly normally, except for occasional sluggishness and an increased use of bandwidth, so their owners generally didn’t notice anything was wrong.

The types of devices that were taken over were, for the most part, ‘just’ IP cameras, but despite that, the botnets created by the Mirai malware have been used to perform some of the largest and most disruptive distributed denial-of-service attacks ever recorded.
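
Mirai itself spoke telnet to exposed devices, but the underlying idea, walking a list of factory defaults, is almost trivially simple. Here, as a purely illustrative and defensive sketch, is the same check expressed as an HTTP basic-auth audit to run against a camera you administer yourself; the URL handling and all but the best-known credential pairs are assumptions:

```python
import urllib.request
from urllib.error import HTTPError, URLError

# A few of the sort of factory-default pairs Mirai's (publicly leaked)
# credential list carried. The full list ran to roughly sixty entries.
DEFAULT_CREDENTIALS = [
    ("admin", "admin"),
    ("root", "root"),
    ("admin", "password"),
    ("root", "12345"),
]

def audit_device(url):
    """Try each factory-default pair against a device's HTTP basic-auth
    login. Only ever point this at devices you administer yourself."""
    for user, password in DEFAULT_CREDENTIALS:
        manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
        manager.add_password(None, url, user, password)
        opener = urllib.request.build_opener(
            urllib.request.HTTPBasicAuthHandler(manager))
        try:
            opener.open(url, timeout=5)
            return (user, password)  # a default pair still works: change it
        except HTTPError:
            continue                 # this pair was rejected, try the next
        except URLError:
            return None              # device unreachable
    return None
```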

More recently there have been WannaCry and NotPetya. Though not specifically targeted at the Internet of Things, both shut down factories, railways, even wind turbines.

But these are just leading indicators, because the attacks are becoming more subtle. The concepts of malicious data, maldata, and data spam aren’t in wide circulation yet, but they soon will be.

One leading indicator of that sort of attack is the recent spate of ‘fabricated data leaks.’ These are, purportedly, data sets of accounts and personal information originating from websites or cloud services, except that they’re almost entirely fabricated from thin air.

The email addresses might be real, but the other information—date of birth, for instance—is just wrong.

For the most part the attacks against the Internet of Things have been constrained by, well, the Internet, by how the attackers think about what a computer is. But that’s changing.

There was a case recently where one vineyard owner started feeding false data to their neighbours’ soil sensors. The low moisture readings from the sensors triggered the sprinklers, and led the neighbours to overwater their vines. They ended up paying fines for excess water usage.

It’s possible that maliciously injected data, subtle changes — a 1 here, a 0 there — might be more of a threat than the brute-force (and very obvious) distributed denial-of-service attacks we’ve been seeing up till now.
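
One plausible defence is to stop trusting raw sensor readings and to apply a plausibility filter before they can drive an actuator. A minimal sketch, with thresholds invented for illustration; in practice they’d come from the sensor’s datasheet and the site’s historical data:

```python
# Thresholds are invented for illustration; real values would come from
# the sensor's datasheet and the site's historical readings.
MIN_MOISTURE, MAX_MOISTURE = 5.0, 60.0  # plausible volumetric water content, %
MAX_STEP = 4.0                          # largest plausible change per sample

def is_plausible(reading, previous):
    """Reject values outside the physical range, or jumps faster than
    soil moisture can actually change between samples."""
    if not MIN_MOISTURE <= reading <= MAX_MOISTURE:
        return False
    if previous is not None and abs(reading - previous) > MAX_STEP:
        return False
    return True

# Implausible readings get logged for human review instead of being
# allowed to drive the irrigation controller directly.
```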

Meanwhile, the arrival of machine learning on a chip — the Intel Movidius VPU, for instance — could potentially help make “smart objects” actually smart, rather than just network-connected clients for machine learning algorithms running in remote data centres. It has the potential to let us put the smarts on the smart device, rather than in the cloud.

But it might also be the thing that breaks our current architectural assumptions around the Internet of Things. Like the arrival of the web, things will still work, they’ll just be different.

Historically, separate networks peered to become the Internet; routing traffic from one network to another destroyed walled gardens — like CompuServe — and community networks alike. The value of seamless hops for your data, and of access to everybody, and their content, is obvious.

But the Internet of Things isn’t the Internet. Despite the name, very little has so far been done to connect one network of things to another; most things exist in siloed networks, with gateways to the ‘real’ Internet where carefully curated analytics about the things themselves are displayed.

That isn’t an internet, that’s just ‘networked’.

Despite claims to the contrary from some parties, the Internet of Things needs low-power, long-range networking standards. WiFi and cellular aren’t solutions to most of the IoT use cases I’m seeing.

LoRa is the only one that is open, the only one that offers the possibility for things on the Internet of Things to be part of the Internet rather than just on the Internet. But it remains to be seen whether this is the TCP of the Internet of Things, whether the networks we’re building today are the ARPANET, or the FidoNets, of history. The next year or two will decide that.