Data Visualization Fundamentals
1. Using Data Visualization to Find Insights in Data
Data is invisible.
Data tables are also a type of data visualization, but tables alone don’t allow us to immediately identify patterns within the data.
2. How to visualize?
- types of visualization
- Tables are very powerful when you are dealing with a relatively small number of data points and data with a single variable. (Edward Tufte suggested including small chart pieces within table columns.)
- Bar charts are perfect for categorical comparison.
- Line charts are especially suited to showing change over time, such as time-series data (where one variable is a time, date, or number recorded at equal intervals).
- Scatter plots. Date/time data recorded at unequal intervals is not time-series data and should be visualized in a scatter plot. Using a line chart in this situation could be misleading because it ignores the potential changes within the intervals.
- Mapping graphs connect data to the physical world.
- Graphs are all about showing the interconnections (edges) between your data points (nodes).
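The line chart versus scatter plot rule above can be checked mechanically before charting. The following is a minimal sketch, assuming the dates are already sorted; the function names `is_evenly_spaced` and `suggest_chart` and the sample dates are hypothetical and only illustrate the rule of thumb from these notes.

```python
from datetime import date


def is_evenly_spaced(dates):
    """Return True if all consecutive dates are the same number of days apart."""
    gaps = {(later - earlier).days for earlier, later in zip(dates, dates[1:])}
    return len(gaps) <= 1


def suggest_chart(dates):
    """Apply the rule of thumb: equal intervals -> line chart, otherwise scatter plot."""
    return "line chart" if is_evenly_spaced(dates) else "scatter plot"


# Hypothetical sample data: one weekly series, one with irregular gaps.
weekly = [date(2020, 1, 1), date(2020, 1, 8), date(2020, 1, 15)]
irregular = [date(2020, 1, 1), date(2020, 1, 3), date(2020, 2, 20)]

print(suggest_chart(weekly))
print(suggest_chart(irregular))
```

Comparing gaps in whole days is a simplification: monthly data would count as irregular because months differ in length, so a real check may need a looser notion of "same interval".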
- analyze and interpret what you see
- What can I see in this image? Is it what I expected?
- Are there any interesting patterns?
- What does this mean in the context of the data?
- document insights and steps along the way
It’s a good idea to start the documentation by writing down these initial thoughts.
- Why have I created this chart?
- What have I done to the data to create it?
- What does this chart tell me?
- transform data
The most common transformation is log(). Other transformations may be appropriate, depending on the data.
An example: when you have extreme values in your data, your bar chart will look very unbalanced, and you can barely tell the difference between b, c, and e.
But if we transform the values to log2, all of them fall below 10, and the chart looks much better.
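A small sketch of the log2 idea follows. The values for a to e are made up for illustration (the original data is not given in these notes); they are chosen so that a and d dwarf b, c, and e on a linear scale.

```python
import math

# Hypothetical values: "a" and "d" are extreme compared with "b", "c", "e".
values = {"a": 1000, "b": 4, "c": 8, "d": 512, "e": 2}

# log2 is only defined for positive numbers, so this assumes all values > 0.
log_values = {label: math.log2(value) for label, value in values.items()}

# Print a rough text bar chart of the transformed values.
for label, log_value in log_values.items():
    print(f"{label}: {'#' * round(log_value)} ({log_value:.2f})")
```

After the transformation b, c, and e become 2, 3, and 1, so their differences are visible next to a and d, which are now just under 10 and exactly 9. The axis then shows powers of two rather than raw values, which should be stated on the chart.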
- zooming
To have a look at a certain detail in the visualization
- aggregation
- To combine many data points into a single group
- filtering
To (temporarily) remove data points that are not in our major focus
- outlier removal
To get rid of single points that are not representative of 99% of the dataset.
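The four operations above (zooming, aggregation, filtering, outlier removal) can each be sketched as a small function over a list of records. Everything here is an assumption made for illustration: the record layout, the function names, and in particular the outlier rule, which uses the median absolute deviation as one possible way to decide what counts as "not representative".

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (category, day, value).
records = [
    ("north", 1, 10.0), ("north", 2, 12.0), ("south", 1, 9.0),
    ("south", 2, 11.0), ("north", 3, 500.0), ("south", 3, 10.0),
]


def zoom(rows, first_day, last_day):
    """Zooming: keep only the rows inside a range of days."""
    return [r for r in rows if first_day <= r[1] <= last_day]


def aggregate(rows):
    """Aggregation: combine many rows into one total per category."""
    totals = defaultdict(float)
    for category, _day, value in rows:
        totals[category] += value
    return dict(totals)


def filter_out(rows, category):
    """Filtering: (temporarily) drop a category that is not in focus."""
    return [r for r in rows if r[0] != category]


def remove_outliers(rows, k=5.0):
    """Outlier removal: drop rows more than k median absolute deviations
    away from the median value (an assumed rule, not the only option)."""
    values = [r[2] for r in rows]
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return list(rows)
    return [r for r in rows if abs(r[2] - med) <= k * mad]


print(aggregate(records))
print(aggregate(remove_outliers(records)))
```

Each function returns a new list or dict and leaves `records` untouched, which fits the "temporarily" in the filtering note: the full data stays available while you explore.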
Which Tools to Use
Visualization and data wrangling should be easy and cheap, and are better done in the same software than in separate programs.
3. Data visualization DIY: Our Top Tools
Tableau Public
You can make pretty complex visualizations simply and easily with up to 100,000 rows.
Tableau is designed for PCs, although a Mac version is in the works
Google spreadsheet charts
including the animated bubbles used by Hans Rosling’s Gapminder
They are pretty design-neutral, which is useful in small charts
Many Eyes
it was a unique exercise in allowing people to simply upload datasets and visualise them
you can’t edit the data once you’ve uploaded it
Color Brewer
Not, strictly speaking, a visualization tool; it is really for choosing map colors.
There are lots of others out there too, including:
- Chartsbin A tool for creating clickable world maps
- iCharts Specialises in small chart widgets
- Geocommons Shares data and boundary data to create global and local maps
4. Information Is Beautiful Awards
https://www.informationisbeautifulawards.com/showcase?page=1&pcategory=winner&type=awards