Despite everything you hear about “Big Data” these days, here’s a piece of news: Anything you will ever see in your life is “small data”.
In fact, most of it is minuscule. Even the best-designed and most informative dashboard will rarely show more than a few hundred data points:
- A bar chart only shows as many data points as the number of bars on the chart.
- Despite millions of trades and price changes per year, a line chart of a stock’s price cannot logically show more data points than the number of pixels across your screen (typically around 2,000 at most).
- A scatter plot breaks down as points begin to overlap significantly; at that point the data is better represented as a heatmap or another aggregated view.
- Maps are possibly the densest data visualizations there are; a full-screen map may be the only visualization in which every pixel on your screen could be said to hold meaningful information. Yet what we take away from looking at a map is usually only a handful of data points: “Get on route 9; On 95, take the exit towards Portsmouth; Drive a couple of miles to exit 22; You’ll see the office on the right”.
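To make the line-chart point concrete: before plotting, a series with a million points can be reduced to one value per pixel column without losing anything the screen could have shown anyway. Here is a minimal sketch of that idea; the function name `downsample`, the bucket-mean strategy, and the synthetic price series are all illustrative assumptions, not a reference to any particular charting library.

```python
import numpy as np

def downsample(series, width=2000):
    """Reduce a long series to at most `width` points, one per pixel column.

    Illustrative sketch: each bucket of consecutive samples is collapsed to
    its mean. Real charting tools often keep min/max per bucket instead, so
    that spikes survive the reduction.
    """
    series = np.asarray(series, dtype=float)
    if len(series) <= width:
        return series
    # np.array_split handles lengths that are not an exact multiple of width.
    buckets = np.array_split(series, width)
    return np.array([b.mean() for b in buckets])

# A million "price" samples collapse to 2,000 plottable points.
prices = np.sin(np.linspace(0, 50, 1_000_000))
plot_ready = downsample(prices)
```

A mean per bucket is the simplest choice; swapping in per-bucket min and max (two points per column) is a common refinement when preserving extremes matters.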
And that is kind of the point. Data visualizations are supposed to give us a simple and informative view of — sometimes — big and complex matters.
“Big Data analytics” is obviously important, but at any given moment a human is looking at a few thousand data points at most. Technically, humans never look at Big Data itself, only at aggregations, selections and extrapolations of big data sets distilled into “small data”.
Big Data analytics is about efficiently identifying, surfacing and sometimes generating (aggregates, samples, etc.) small, representative views of the Big Data and serving them up to us humans.
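The distillation step described above can be sketched in a few lines: a large event stream collapses into a handful of human-readable numbers. The million synthetic “trade” records and the hour-of-day grouping below are illustrative assumptions, standing in for whatever dimension an analyst would actually aggregate on.

```python
import random
from collections import Counter

random.seed(0)

# Stand-in for a big data set: one million trade events,
# each tagged with the hour of day it occurred in.
trade_hours = [random.randrange(24) for _ in range(1_000_000)]

# The "small data" a human actually looks at: 24 data points,
# one count per hour, regardless of how many events went in.
per_hour = Counter(trade_hours)
```

A million inputs, twenty-four outputs: the chart a person reads is built from the aggregate, never from the raw stream.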
The only true consumers of Big Data are machines.
This post was inspired by a recent interaction on Twitter.