Random thoughts on the Communication vs Computation Continuum
Recently I watched a talk by Peter Levine called The End of Cloud Computing, where he discusses the next major shift from centralized computing to distributed computing at the edges. He mentions that this is part of a cycle we have seen repeat in computing. At one point there were dumb terminals, which gave way to personal computing, and then of course cloud computing. Peter theorizes that we are headed back down the road to distributed computing, as the Internet of Things will need decision making at the edges, close to the data.
I think the broader concept is in fact this cycle of centralized --> distributed --> centralized --> distributed. What drives this phenomenon and is there a way to notice where on the cycle we are?
Dumb terminal computing was defined by an end user terminal that was strictly responsible for input and display while all major computation was done on a beefier centralized server. This setup was common in a university or office environment where the terminals would all be connected up over a local area network. The dumb terminals were relatively inexpensive compared to the server that did all of the real work and so it was an early form of virtualization. Each user was sharing computing time and resources with everyone else connected to the same server.
Personal computing was the first shift away from the dumb terminal phase. It gave each user a full set of computation capabilities on their own machine, with no network required. This made it possible for users to have a PC in their home, whereas the previous dumb terminals required an expensive network and central server.
The next shift, to cloud computing, occurred with the smartphone’s rise to prominence. The computing power available to a mobile phone was nowhere near that of the PC it followed, so new models had to be defined where the computing burden was shared between a centralized cloud and the end handset. What was initially a crutch was turned into an advantage as the focus shifted back from end user computing to a centralized, super networked cloud with immensely more power available. Big data is a key benefit of this, as these cloud computers can churn through data and generate insights at a speed never seen before.
Internet of Things
Peter Levine’s talk is about how the Internet of Things will force a shift back to distributed computation. His logic is that as things like self driving cars process massive amounts of data, they will have to compute at the edge because decisions will need to be made faster than the latency involved in sending data up to the cloud and getting a response. I think this is a fair assessment of the self driving car, but what about the rest of IoT, like thermostats, refrigerators, fans, HVAC, and everything else? There would still be a lot of things that could tolerate the higher delay of waiting for a cloud response.
So Peter Levine is wrong?
No, I don’t think he is wrong at all. I agree with him that we will again see a push to computation at the edges. And he even mentions what I think is actually the core reason for this happening. He discusses that IoT is going to bring an explosion in the amount of data being processed like we have never seen before. GBs of data in seconds, and that is now. Add in a couple of orders of magnitude and you can see how TBs or PBs of data could be generated every few seconds from all of these devices.
But the cloud was built for data processing
But we just discussed how the cloud was built for data processing and can scale to the PBs per second of processing needed to handle all of this data, right?
We missed something
The missing element is the throughput of our communication networks. If the communication network is slow, then we have to decentralize. If the communication network is fast, then a centralized model works better because operational support, utilities, etc. are all cheaper to aggregate in fewer locations. But this is a continuously improving process: as one of these two components, communication or computation, becomes the choke point, resources shift towards innovating on it, and the needle starts moving back in the other direction.
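To make that trade-off concrete, here is a rough sketch of the decision in Python. Everything in it is my own assumption for illustration: the function names, the idea of a per-decision deadline, and the preference for the cloud whenever it is fast enough (standing in for the cost argument above). It is a toy model, not a description of how any real system decides.

```python
def cloud_response_seconds(payload_bits, uplink_bps, round_trip_s, cloud_compute_s):
    """Estimated time to ship the data to the cloud and get an answer back.

    Hypothetical model: transfer time + network round trip + cloud compute time.
    """
    return payload_bits / uplink_bps + round_trip_s + cloud_compute_s


def choose_location(payload_bits, uplink_bps, round_trip_s,
                    cloud_compute_s, edge_compute_s, deadline_s):
    """Pick where to compute, preferring the (assumed cheaper) cloud."""
    cloud_s = cloud_response_seconds(
        payload_bits, uplink_bps, round_trip_s, cloud_compute_s
    )
    if cloud_s <= deadline_s:
        return "cloud"    # network is fast enough, so centralize
    if edge_compute_s <= deadline_s:
        return "edge"     # network is the bottleneck, so decentralize
    return "neither"      # neither option meets the deadline


# Made-up numbers: a thermostat sending 1 kB with a relaxed 5 second deadline.
thermostat = choose_location(8_000, 1_000_000, 0.1, 0.05, 0.5, 5.0)

# Made-up numbers: a car with 1 GB of sensor data and a 100 ms deadline.
car = choose_location(8_000_000_000, 100_000_000, 0.05, 0.01, 0.05, 0.1)
```

With these invented inputs the thermostat lands in the cloud and the car lands at the edge, which is the same split described above: the amount of data relative to network throughput decides the answer, not the device category.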
Through the lens of communication/computation
For example, let’s look at the shift from dumb terminals to personal computing using this model. Dumb terminals worked great in an office setting where network speeds were a blazing fast 10Mb/s (for that era in computing). The demand for computing at home started to rise, though, and the only communication method from homes was likely a modem with a speed of maybe 2.4Kb/s. That means the office network was over 4000X faster, which made dumb terminals work, but once you slowed that down to modem speed, it was never going to be possible. So the end computers had to get more powerful to support this use case of personal computing, as the network became the bottleneck.
Cloud computing can be looked at in the same light: network speeds for both home and mobile users increased to the point that computation could again be run more efficiently in a centralized cloud offering. The network was no longer the bottleneck, so we could centralize computing and open new fronts in the amount of processing available to innovate with.
IoT is going to force us back the other way, as the data generated by IoT is orders of magnitude larger than what we use currently, and it will seem like we went back in time to that old example of dumb terminals. They worked great when we were sending a manageable amount of data back and forth, but get ready to feel like dial up again if you want to send the amount of data needed for IoT decisions to the cloud before getting a response.
IoT will again force computation to the edges and thus will face similar challenges to personal computing. Even though there can be some centralized summary of data processed in the cloud, the full breadth of data available cannot be consumed in aggregate in the center to review and optimize future decisions. It will inherently mean that some edges will work great and others will not, but it will be extremely hard to troubleshoot the where and why at scale. It will require pinpointing the worst offenders and collecting all of their data to analyze, but this will miss a lot.
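The arithmetic behind that comparison is easy to check. Here is a quick Python sketch using the two speeds from the paragraph above. The decimal units (1 Kb = 1,000 bits), the helper names, and the 100 kB example payload are my own assumptions for illustration.

```python
OFFICE_LAN_BPS = 10_000_000   # 10 Mb/s office network
HOME_MODEM_BPS = 2_400        # 2.4 Kb/s modem


def speed_ratio(fast_bps, slow_bps):
    """How many times faster one link is than another."""
    return fast_bps / slow_bps


def transfer_seconds(payload_bytes, link_bps):
    """Time to push a payload over a link, ignoring latency and overhead."""
    return payload_bytes * 8 / link_bps


ratio = speed_ratio(OFFICE_LAN_BPS, HOME_MODEM_BPS)

# Hypothetical 100 kB of screen updates, sent over each link.
lan_seconds = transfer_seconds(100_000, OFFICE_LAN_BPS)
modem_seconds = transfer_seconds(100_000, HOME_MODEM_BPS)

print(f"LAN is about {ratio:.0f}x faster")
print(f"100 kB takes {lan_seconds:.2f} s on the LAN, {modem_seconds:.0f} s on the modem")
```

The ratio comes out to roughly 4,167, and the same payload that takes a fraction of a second in the office takes more than five minutes over the modem.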
Everything is cyclical
At some point this lack of network communication speed will become the limiting factor in moving forward, and some new model that pushes back towards centralization will take hold, creating new ways of using IoT data that were never imagined before.
It’s all a cycle where computation wants to be centralized for cost efficiency and also decentralized for speed. This constant push-pull between cost and speed is what forces newer and better communication models and drives innovation forward. No matter what, it is exciting to be able to work on this future and shape the path that it takes.