How to convert a table into long-form or tidy-form for seaborn visualizations

Anirudh Kashyap
Nov 20, 2017 · 3 min read

Presenting data to non-technical users is often a difficult task, and seaborn creates excellent visualizations that bridge the gap between a data scientist and the audience. Seaborn is a Python visualization library built on top of matplotlib that creates beautiful plots.

Seaborn often requires the data in tidy form. If you want a factor plot where multiple categories are plotted on the same axis, the data needs to be a long-form dataset. What does this mean?

A tidy (or long-form) datatable has the following characteristics:

  1. Each variable you measure should be in one column.
  2. Each different observation of that variable should be in a different row.

Let us see this in an example:

import pandas as pd

# CREATING A COST/BENEFIT TABLE & PLOTTING IT ON A FACTOR PLOT
approaches_cost = [425500, 275250, 101500]  # COSTS OF A PROCESS
approaches_savings = [-500, 149750, 323500]  # SAVINGS OF A PROCESS
cols_approach = ["Aggressive", "Moderate", "Conservative"]  # TYPES OF APPROACHES
# CREATING THE DATATABLE
wnv_approaches = pd.DataFrame({"Approach": cols_approach, "Cost": approaches_cost, "Savings": approaches_savings})

Here is how the table looks:

Starting datatable
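For readers who cannot see the image, here is a minimal sketch that rebuilds the same table and prints it. The variable name `wide` is only illustrative and is not used elsewhere in the article.

```python
import pandas as pd

# Same numbers as in the article: one row per approach,
# one column per expense type (this is the "wide" layout).
wide = pd.DataFrame({
    "Approach": ["Aggressive", "Moderate", "Conservative"],
    "Cost": [425500, 275250, 101500],
    "Savings": [-500, 149750, 323500],
})

print(wide)
#        Approach    Cost  Savings
# 0    Aggressive  425500     -500
# 1      Moderate  275250   149750
# 2  Conservative  101500   323500
```

Note that "Cost" and "Savings" are really two levels of one variable (the expense type) spread across two columns, which is why the table is not yet tidy.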

Now in order to create a visualization where I can see the cost & savings on the same axis with the approach on the X-axis, I need to create a factorplot. The factor plot documentation states the following:

data : DataFrame

Long-form (tidy) dataset for plotting. Each column should correspond to a variable, and each row should correspond to an observation.

So, to convert this table into long form, we use the pandas melt function.

approaches_plot = pd.melt(wnv_approaches, id_vars="Approach", var_name="Expense_Type", value_name="$ Amount")
Tidied up table — ready to be plotted in seaborn
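Since the screenshot of the melted table is not visible here, the following sketch shows what `pd.melt` returns for this data. It is self-contained, and the names `wide` and `long_df` are illustrative stand-ins for the article's `wnv_approaches` and `approaches_plot`.

```python
import pandas as pd

wide = pd.DataFrame({
    "Approach": ["Aggressive", "Moderate", "Conservative"],
    "Cost": [425500, 275250, 101500],
    "Savings": [-500, 149750, 323500],
})

# Keep "Approach" as the identifier; stack every other column.
# The old column names go into "Expense_Type", the values into "$ Amount".
long_df = pd.melt(
    wide,
    id_vars=["Approach"],
    var_name="Expense_Type",
    value_name="$ Amount",
)

print(long_df)
#        Approach Expense_Type  $ Amount
# 0    Aggressive         Cost    425500
# 1      Moderate         Cost    275250
# 2  Conservative         Cost    101500
# 3    Aggressive      Savings      -500
# 4      Moderate      Savings    149750
# 5  Conservative      Savings    323500
```

The three-row table becomes six rows: one row per (approach, expense type) observation, with a single column holding the amounts.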

This stacks the cost and savings values into a single column and lets seaborn plot them on the same Y-axis scale. Here is how the plot looks.

Or switching the format to “bar”
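The plotting call itself is not shown in the text, so the following is only a sketch of how the two figures could be produced. It assumes seaborn 0.9 or later, where `factorplot` was renamed to `catplot`; `kind="point"` mirrors the old `factorplot` default. The column-to-axis mapping is an assumption based on the description above (approach on the X-axis, amount on the Y-axis, one color per expense type).

```python
import pandas as pd
import seaborn as sns

wide = pd.DataFrame({
    "Approach": ["Aggressive", "Moderate", "Conservative"],
    "Cost": [425500, 275250, 101500],
    "Savings": [-500, 149750, 323500],
})
approaches_plot = pd.melt(
    wide, id_vars=["Approach"], var_name="Expense_Type", value_name="$ Amount"
)

# Point plot: cost and savings share the same Y axis, one line per expense type.
g_point = sns.catplot(
    data=approaches_plot,
    x="Approach", y="$ Amount", hue="Expense_Type",
    kind="point",
)

# Same data, drawn as grouped bars instead.
g_bar = sns.catplot(
    data=approaches_plot,
    x="Approach", y="$ Amount", hue="Expense_Type",
    kind="bar",
)
```

In a notebook the figures render on their own; in a script you would typically call `matplotlib.pyplot.show()` or `g_point.savefig(...)` afterwards.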

So, now the obvious question is: how do I go back to my original data table? Well, pandas has a .pivot_table method that un-tidies the datatable. It works like this:

# SWITCHING BACK FROM THE TIDY DATA TABLE TO THE UNTIDY DATA TABLE
untidy_approaches = approaches_plot.pivot_table(index="Approach", values="$ Amount", columns="Expense_Type")
untidy_approaches.reset_index(drop=False, inplace=True)  # MOVE "Approach" BACK INTO A COLUMN; INDEX BECOMES 0/1/2
untidy_approaches.columns.name = None  # REMOVE THE "Expense_Type" LABEL FROM THE COLUMNS
untidy_approaches

Switching back from a tidy datatable to an untidy one takes a little bit of effort, and playing around with the parameters of df.pivot_table will help you get the table that you want.

Back to the old form
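As a quick check of the round trip, here is a minimal sketch that melts the table, pivots it back, and compares the result with the original. Two details are worth knowing: `pivot_table` sorts the rows by the index, so the approaches come back in alphabetical order, and its default aggregation is the mean, so the integer amounts come back as floats. The names `wide`, `long_df`, `back`, and `expected` are illustrative.

```python
import pandas as pd

wide = pd.DataFrame({
    "Approach": ["Aggressive", "Moderate", "Conservative"],
    "Cost": [425500, 275250, 101500],
    "Savings": [-500, 149750, 323500],
})
long_df = pd.melt(
    wide, id_vars=["Approach"], var_name="Expense_Type", value_name="$ Amount"
)

# Long -> wide: one row per approach, one column per expense type.
back = long_df.pivot_table(index="Approach", columns="Expense_Type", values="$ Amount")
back = back.reset_index()   # "Approach" becomes a regular column again
back.columns.name = None    # drop the leftover "Expense_Type" label

print(back["Approach"].tolist())
# ['Aggressive', 'Conservative', 'Moderate']

# Same content as the original once the row order and dtypes are accounted for.
expected = wide.sort_values("Approach").reset_index(drop=True)
pd.testing.assert_frame_equal(back, expected, check_dtype=False)
```

Because each (Approach, Expense_Type) pair occurs exactly once here, taking the mean leaves every value unchanged.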

There it is. We have now seen a way to convert a datatable into its tidy form and then convert it back into its original form. This gives us the flexibility to plot several columns on the same axis in seaborn.
