Tableau and Power BI Shouldn’t Be Your Go-to Data Analysis Tool


Photo by Cesar Carlevarino Aragon on Unsplash

Visualization tools such as Tableau and Power BI are incredibly popular, and they are what many people think of when they hear the term analytics. However, analytics is much more than the dashboards created in these tools, and the real work behind those dashboards often goes unnoticed, or even disregarded, by analytics departments themselves. Analysts and teams frequently view the dashboard, the front-end visualization, as the product rather than the entire architecture behind it. This leads to multiple sources of truth, massive inefficiencies, large amounts of technical debt, and a lot of bugs, and all of it goes largely undetected in most organizations.

There are two common activities in the day-to-day life of a data analyst, and each has room for improvement. First, analysts frequently receive ad-hoc analysis requests from stakeholders and perform the analysis inside Tableau or Power BI, their go-to analysis tools. Second, analysts develop dashboards that are meant to be used day-to-day by stakeholders. These dashboards can be viewed as “production-level” reporting. Here’s what happens in practice for ad-hoc requests, why it’s a problem, and a better way of working. I’ll save the topic of production-level dashboards for another day.

For ad-hoc requests, analysts really shouldn’t be using Tableau and Power BI to do the analysis. These tools are geared toward building production-level reporting rather than performing ad-hoc analysis, and here are a few reasons why they shouldn’t be used.

First, they can’t handle massive amounts of data. An analyst can pull a few million records into these tools, but anything more than that will grind to a halt or leave the analyst idly waiting a long time to get the data they need. This makes the analyst unproductive, which is a financial cost to the organization, and it also prevents them from increasing their scope of work. As you probably know, increasing one’s scope of work is required for a promotion.

Second, it’s quite common for analysts to pull in more data than they need. Because these are visual programming tools, analysts frequently overlook the negative impact of pulling in too much data and tend to take more than they need. Not only does this grind the application to a halt for the analyst, it also puts extra load on the database server to query data that isn’t necessary. This leads to slower queries for everyone in the company and higher compute costs.
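As a minimal sketch of the alternative for these first two points: push the column selection, filtering, and aggregation down to the database so only a small result set ever leaves it. The table and column names below (orders, order_date, region, revenue) are hypothetical, and exact syntax varies by SQL dialect, but the pattern holds anywhere:

```sql
-- Anti-pattern: a BI extract that effectively runs "SELECT * FROM orders",
-- dragging every column and every historical row into the tool.

-- Better: ask the database for only what the analysis needs.
-- (Table and column names are hypothetical.)
SELECT
    region,
    SUM(revenue) AS total_revenue,
    COUNT(*)     AS order_count
FROM orders
WHERE order_date >= DATE '2024-01-01'   -- limit to the period in question
GROUP BY region                         -- aggregate in the database,
ORDER BY total_revenue DESC;            -- returning a handful of rows, not millions
```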

Third, it’s incredibly hard, if not nearly impossible, to properly perform quality assurance checks on the data being surfaced in these visualization tools. There are more than a half-dozen layers of filters along with hidden formulas (much like the formulas that exist in Excel), and that’s on top of how the data is queried and joined on the database side. This is why it’s so hard to perform quality assurance checks and why most checks fall by the wayside. It’s also why so many bugs get created within dashboards and why bugs frequently go unnoticed for long periods of time.

Fourth, if the analyst is functioning as a high-value data analyst, they’re going to be curious and want to dig deeper to understand the why behind whatever metrics and data were pulled into the tool. This means going back to pull more data, spending time inefficiently, and putting themselves at risk of creating more bugs.

Fifth, the code can’t easily be shared across future projects or with co-workers. This means that nearly every ad-hoc request from every analyst is handled in a silo, where the analyst is likely reinventing something they had created in the past or that another analyst had already created. Again, this is how analytics teams end up with multiple sources of truth, inefficient operations, higher database utilization, and higher costs. With these visualization tools, there’s no clean way to share your code. If a co-worker requested the code to get the same data with the same calculated fields, someone would have to write the SQL version of what the dashboard produced.
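By contrast, shared logic can live in the database itself rather than inside one person’s workbook. A sketch, assuming the same hypothetical orders table and a warehouse that supports views:

```sql
-- One shared definition of "net revenue by customer" that anyone can reuse,
-- instead of each analyst re-deriving it inside their own dashboard.
CREATE OR REPLACE VIEW customer_net_revenue AS
SELECT
    customer_id,
    SUM(revenue) - SUM(discount) AS net_revenue   -- the agreed-upon calculation
FROM orders
GROUP BY customer_id;

-- A co-worker's ad-hoc request can then start from the shared definition:
SELECT customer_id, net_revenue
FROM customer_net_revenue
WHERE net_revenue > 10000;
```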

I could continue listing challenges, but the point is this: there are a lot of issues with using these tools, and there’s a better way. Everything the analyst needs to do can be done using SQL, except for creating charts and graphs. But SQL can easily generate summary data that can be copied and pasted into Excel for ad-hoc charting. By using SQL, nearly all the problems mentioned above disappear.
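For example, a summary query like the one below (using the hypothetical orders table again; DATE_TRUNC is dialect-specific, but most warehouses have an equivalent) returns a small table that can go straight into Excel if a chart is needed:

```sql
-- Monthly revenue summary: a dozen rows instead of millions,
-- ready to copy into Excel for a quick ad-hoc chart.
SELECT
    DATE_TRUNC('month', order_date) AS order_month,
    COUNT(*)                        AS order_count,
    SUM(revenue)                    AS total_revenue
FROM orders
WHERE order_date >= DATE '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date)
ORDER BY order_month;
```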

There’s no longer an extra, unnecessary layer between the data in the database and the final output. Many times, all an analyst needs is the tabular data anyway, to pass along to a stakeholder. By removing that layer and writing actual code, rather than letting a visual programming tool do the work, the analyst is forced to think a bit harder about what data is needed, reducing the chance of querying more data than necessary. This also makes it easier to QA the code and the output, because the logic isn’t hidden in a bunch of objects or “cells” like we see in Excel.

The more visibility there is into the code, the greater the chance the analyst will spend time performing QA checks and the greater the chance of detecting bugs. Lastly, the code is shareable because it’s actual code, not just a visualization in an application. This means more team members can use the same code across multiple projects and requests, which leads to more efficient operations and fewer sources of truth.
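Because the logic is plain SQL, those checks can be run directly against it. A sketch of the kind of quick QA query this enables, again using the hypothetical orders table:

```sql
-- Quick sanity check: confirm order_id is unique before trusting any
-- downstream sums, the sort of check that's hard to run inside a dashboard.
SELECT
    order_id,
    COUNT(*) AS row_count
FROM orders
GROUP BY order_id
HAVING COUNT(*) > 1;   -- any rows returned indicate duplicate orders
```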

So, if you’re a data analyst, I challenge you to do all of your ad-hoc work using SQL. When you do, you’ll find yourself producing fewer bugs, delivering higher-quality results and deeper insights, and having more time to take on bigger projects. Also, your data engineering team will be happier because you’ll be reducing load on the database, which reduces costs for your organization and allows everyone’s queries to run faster!

Brandon Southern, MBA, is the founder of Analytics Mentor, specializing in providing analytics advising, consulting, training, and mentorship for organizations and individuals. Brandon has been in tech for 20 years in roles including analytics, software development, release management, quality assurance, six-sigma process improvement, project & product management, and more. He has been an individual contributor as well as a senior leader at start-up companies, GameStop, VMWare, eBay, Amazon, and more. Brandon specializes in building world-class analytics organizations and elevating individuals.

You can learn more about Brandon and Analytics Mentor at http://www.analyticsmentor.io/about
