New to Tableau? The most important thing you should know.

Chris Geatch
5 min readDec 5, 2022

--

You’ve just been given Tableau to play with for your data visualisation, and you’re keen to get started - you’ve dumped some of your existing data into it, you’ve dragged some fields onto some worksheets for a few minutes, it all looks pretty, then you get stuck in and you’re sitting thinking, “This is rubbish, why can’t I just <insert anything here>?”

The answer is, actually you can but, from my experience, there’s about a 90% chance that your data is the problem, not Tableau. There is obviously a learning curve around how to do things in any application, and I’m not here to give you the knowledge to create everything you could imagine (or even start making it look pretty, ugly charts to come), but if your data is “wide”, with a column for each variant of your measure, then you’re going to struggle in Tableau, and I don’t want you to do that. If your data looks something like the image below, with columns for each year, and you’ve got 5 minutes (that will save you days worth of time down the line), then you definitely want to read on.

An Excel spreadsheet with revenue and volume data by Month, in columns for each 3 years
Wide Data in Excel

In the vast majority of cases, Tableau works better with “Tall” data. There’s plenty out there on how to use the functionality of Tableau, but where people often get stuck is right at the start. They can’t make Tableau do the cool things they find online, because their data simply doesn’t allow it. If you can pivot your data (either at source or you can do it in Tableau itself), that is the single most important thing I can recommend before you get started.

An Excel spreadsheet showing the same data as the previous image, pivoted so that there is a column specifying the year, instead of it being split out by columns
Tall Data in Excel

As soon as you get the data into Tableau, you start to get an inkling of the possibility there might be more work involved with the data on the right (the original, wide data). Two quick wins here are:

  • Fewer fields to drag around
  • Tableau will apply any order changes to all other nested fields, e.g. if I drag TY to be at the front in Revenue, it will also move in Volume. With wide data, you’re on your own.

Obviously, you want to start creating some analysis, and a typical measure might be Yield, the average Revenue per passenger (Volume). I’d also like to consider how this changes between one year and the next, both in absolute and percentage terms.

The fields you’ll have to create using wide data would be:
Yield TY, Yield LY, Yield PY, Yield vLY, Yield vLY%, Yield vPY, Yield vPY%

The field required using the tall data would be:
Yield

When you have a categorical field for the Year, Tableau can automatically create all of the relationships required for a number of calculations: percentage difference, running sum, percent of total… with the facility to do any others with manual calculations. We can still get the same table with both sources, but notice the Measure Values are all the same field when using the tall data (top right in the image below). The little triangle means a “Table Calculation” has been applied, to get the vLY figure and the vLY%. No extra work required (but you might like to read about Table Calculations).

It’s usually at this point where you start thinking, “I want the user to be able to choose which year to compare to.” Parameters are a standard tool in Tableau, and you can use one to switch between what is displayed. With the tall data, you can be a bit sneaky and just filter out the year you don’t want to use, i.e. if you filter out PY then the calculation will automatically use TY and LY (so vLY), and vice versa. No such luck with the wide data, and you’ll need two more calculations to choose which year’s absolute and percentage values to display.

This post isn’t to show you how to do all of that. My point is to consider your data structure before you start. For the simple table I’ve created above, with wide data I needed 9 calculated fields. To create the same thing with tall data I needed one real calculation, and a filter to tell my table which years to use.

This is the simplest example I could think of, and a single worksheet, so you can probably see how your number of calculations can quickly get out of hand. Even relatively simple charts can be painful to create with wide data, and you quickly become limited in what you can display. As a final example of this, I might want to display Revenue alongside Volume (on a dual axis) for all 3 years. I can’t do this with wide data. The best I can do is have 3 charts with the Revenue and Volume on dual axes. This is the kind of moment where someone might think, “Tableau doesn’t do what I need, I’ll go back to Excel”, or create even more calculated fields to get what they want. In addition, the scales aren’t consistent so I can’t easily compare years, because Tableau has no concept that Revenue TY is related in any way to Revenue LY (they could equally be the number of fish and total lightbulbs broken), whereas if I wanted to plot the data like the left hand chart with the tall data source, I could tell Tableau to create uniform y-axes for each of the years, because then it would know that Revenue should always be on the same scale.

Hopefully, this all makes my point that, if you consider your data structure up front, everything becomes way easier, and that cool visualisation you’re trying to replicate will now be within your grasp.

--

--

Chris Geatch

I'm a two time (2022-24) Tableau Forums Ambassador, so thought I'd better make a start on the blog for solutions to problems I (or other people) make for myself