moznotes.com

Sensemaking

Forays into data science

Mo Patel
Published in Making sense
Sep 10, 2013


Here is another collection of words on all the wonders of data, full of marketing buzzwords such as Big Data, Cloud Computing and Data Science. Not quite! The goal of Sensemaking is to feature techniques that are being used to make sense of the massive explosion of data.

First, however, I will use this space to explore a question: why now? In the last decade, several key dominoes have been set in motion that brought forth the data revolution. These dominoes tipped due to several key factors:

  1. This thing called the Internet exploded in the 1990s, which led creative minds to focus on large-scale networked computing systems. Unfortunately, due to plenty of irrationality, the Internet industry suffered a major correction around the year 2000. Thanks to the resulting glut of infrastructure and technology investment, performing complex computing tasks at large scale became cheaper in the 2000s.
  2. In the 2000s, after the dot-com crash, a new set of companies emerged, and those that survived innovated their way out of the disaster. Many dot-com companies may have crashed, but consumers' appetite for information via the Internet never ceased. The engineers at these companies faced the challenge of scaling at low cost. Previously, costs were driven upward by proprietary software and hardware from large technology vendors. Engineers started to design systems focused on using and building open source hardware and software. In the late 2000s, this engineering innovation gave birth to cloud computing.
  3. From warehouse-sized mainframes to smartphones, computing has condensed into smaller and smaller form factors over the last few decades. This portability has resulted in mobile computing and increasing interaction between computers and their environment. As more sensors are loaded into computing devices, the stream of data they collect and transmit keeps growing. Most smartphones introduced in the last year or so have multiple cameras, GPS, an accelerometer, a compass, a gyroscope and audio sensors, to name a few. In addition to personal mobile devices such as smartphones and wearables, sensors have also been placed on physical objects such as roads, bridges, buildings, homes and cars. These objects constantly collect data and add to the variety of data sources, streaming at variable speeds and in large volumes.
  4. With the combination of cloud computing and portable computing, we now live in an age where data is generated, stored and processed at a very rapid pace, in large volumes and in a variety of formats. Mathematicians (Statisticians & Computer Scientists et al), General Scientists (Physicists, Chemists, Biologists et al) and Social Scientists (Economists, Psychologists, Anthropologists, Linguists et al) have for centuries devised methods to extract insights from data. With the advent of computing, these methods have relied upon computers to process data and derive insights. The major breakthrough, however, is the ability to apply these methods at very large scale with rapid response times. Distributed data processing architectures in the cloud have allowed us to scale machine learning, data mining, statistical analysis and artificial intelligence methods over large volumes of data and present the results to millions of consumers over the Internet.

The four factors described above have brought us to this nexus of technological innovation where we can apply new and old methods to new and old data and derive knowledge that can be distributed to millions.
