Choosing an Analytics Tool: Metabase vs Superset vs Redash

Stefan Mihaylov
Published in VorTECHsa
7 min read · Feb 9, 2022

At Vortexa, we recently embarked on a journey to look for a decent analytics tool. We wanted something our engineers could use to throw together dashboards and alerts on top of our data lake. Something that would complement our live monitoring and concentrate more on the actual data we produce and its quality, instead of what we log.

Mind maps are quite useful for starting things off

We quickly realized we would need a structured way to approach the selection, as there is no obvious go-to “one tool to rule them all”. We split the process into two parts:

  • Shortlist a set of tools using some basic criteria
  • Trial and evaluate the shortlist in more detail

Part 1: Shortlist

To reduce the vast ocean of available options to something more manageable, we used these criteria:

  • Price
  • Datasource support
  • Deployment & Support
  • Popularity

A taste of some parties considered

Again, we couldn’t try everything, so we had to be smart about how we spent our time. By using the basic criteria above, we managed to bring the list down to three tools that we would actually experiment and play around with. The title being a dead giveaway, yes, these were: Metabase, Superset and Redash.

Part 2: Trial

Trialing the tools was all about expanding on the basic criteria, plus looking into the following:

  • Alerting and reporting capabilities
  • Single sign-on setup
  • UX and Dashboarding experience

We’ll now look at the more detailed analysis, which I need to preface with the obvious warnings. Some of the criteria are biased towards our own deployment methods and stack needs (Kubernetes, AWS Athena, etc.), so they might not apply to your case. These tools evolve and change pretty quickly, so some of the pains we’ve had might not exist by the time you read this.

In any case, let’s jump right into the details.

Deployment

Infrastructure as code has been really important for us at Vortexa. As such, we’ve been using Helm to assist with our Kubernetes deployments. Having a good, configurable and up-to-date Helm chart played a big role in this category.

Another important criterion for us has been how easy it is to debug problems and track failures; things like multiple Kubernetes pods can make that harder. With some of the tools, we’ve had to redeploy a bunch of times before all the pods came up successfully and in the right order.

Metabase: The official Helm chart Metabase came with has sadly been discontinued, so we’ve had to use a community one. As a dependency, Metabase requires only a DB connection that it uses for storing state. It then spins up one pod in your cluster; configuration is pretty straightforward and can be done once within the tool’s UI.

Superset: Comes with official support for Helm. The chart itself is pretty configurable, and a lot of the functionality is driven by the setup here, e.g. Alerts, Reports and extra data source connectors. Dependencies are Redis and a database, and the chart gives you the option to spin those up for you. You end up with a bunch of pods, some of which are workers and others are for scheduling.

Redash: No official Helm chart; however, the community one is part of the main Redash repo, hence it evolves closely with Redash itself. Dependencies are Redis and a database, which can also be spun up through the chart. No extra work is needed for alerts and schedulers, as those are spun up by default; however, you again end up with a bunch of worker pods.

Verdict:

Superset > Metabase > Redash

Data sources

We are heavy users of AWS Athena here at Vortexa. Athena is a distributed query service and it’s done wonders for us when it comes to querying our data lake. It can scan tens of GBs in seconds. So out-of-the-box support for it, or at least the ability to add it through a plugin, has been greatly appreciated.

Metabase: No official support for Athena, so we had to enable a community plugin within the Helm config.

Superset: A rich selection of data sources. Drivers for the non-standard ones need to be installed through the chart before they can be used. The process is simple, as they are shipped as Python packages.
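
To give a feel for what comes after installing the driver: Superset connects to a database through a SQLAlchemy URI. The sketch below builds one for Athena, assuming the PyAthena driver and its `awsathena+rest` dialect. The helper name, region, schema and bucket are made up for illustration and are not our actual setup.

```python
from urllib.parse import quote_plus


def athena_sqlalchemy_uri(region: str, schema: str, s3_staging_dir: str,
                          access_key: str = "", secret_key: str = "") -> str:
    """Build a PyAthena-style SQLAlchemy URI (hypothetical helper)."""
    # Empty credentials produce ":" so the driver can fall back to the
    # default AWS credential chain instead of keys embedded in the URI.
    credentials = f"{quote_plus(access_key)}:{quote_plus(secret_key)}"
    # The staging location contains ":" and "/", so it must be URL-encoded.
    staging = quote_plus(s3_staging_dir)
    return (
        f"awsathena+rest://{credentials}@athena.{region}.amazonaws.com:443/"
        f"{schema}?s3_staging_dir={staging}"
    )


# Example values only: region, schema and bucket are placeholders.
uri = athena_sqlalchemy_uri("eu-west-1", "analytics",
                            "s3://example-bucket/athena-results/")
print(uri)
```

The resulting string is what you would paste into the database connection form; the encoding of the staging directory is the part that is easiest to get wrong by hand.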

Redash: Rich selection of data sources out of the box. No tweaking required.

Verdict:

Redash > Superset > Metabase

Alerting and reporting

Alerting has been another one of those really important requirements we’ve had. As mentioned before we’ve been doing pretty well when it comes to monitoring our live systems and overall infrastructure health (courtesy of New Relic). What we’ve really wanted lately is the ability to monitor our data and business outcomes.

For example, we have processes in place that take care of extracting data from Excel workbooks. These are best effort and if one sheet is badly formatted the process doesn’t need to fail. However, we still want to record that as something for us to follow up on. Alerting or generating a daily report can help with that. (If you want to see how we handle messy spreadsheets check out our open source refinery project)
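
To make the idea concrete, here is a minimal sketch of the logic behind such a daily report. In a BI tool this would typically be a saved query plus a schedule or threshold rather than Python, and the `SheetResult` record and `build_daily_report` helper are hypothetical names, not part of our pipeline. The `{"text": ...}` shape is the simple message payload Slack incoming webhooks accept.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SheetResult:
    """Outcome of extracting one sheet (hypothetical record)."""
    workbook: str
    sheet: str
    ok: bool
    error: str = ""


def build_daily_report(results: List[SheetResult]) -> Optional[dict]:
    """Return a Slack-style message payload, or None when nothing failed."""
    failures = [r for r in results if not r.ok]
    if not failures:
        return None  # nothing to follow up on, so no message is sent
    lines = [f"{len(failures)} of {len(results)} sheets could not be parsed:"]
    lines += [f"- {r.workbook} / {r.sheet}: {r.error}" for r in failures]
    return {"text": "\n".join(lines)}


results = [
    SheetResult("prices.xlsx", "Jan", ok=True),
    SheetResult("prices.xlsx", "Feb", ok=False, error="missing header row"),
]
report = build_daily_report(results)
print(report["text"])
```

The extraction itself stays best effort; the report only surfaces what was skipped so that someone can follow up.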

For this part, we have mostly been interested in the Slack integration and the effort required to make it work.

Metabase: UI driven. Give it a Slack token and it works like a charm. Easy and intuitive setup.

Superset: Requires Helm tweaking to spin up its scheduler workers.

Redash: Schedulers are up by default, so it is mostly UI setup. The UI is buggy in places. For example, once you choose a query you can’t change it without refreshing the page.

Redash Alerting Window

Verdict:

Metabase > Redash > Superset

UX and Dashboarding experience

By far the most important criterion, and also the hardest to compare, as each tool comes with its own strengths. What we are looking for here is a smooth query-to-chart experience, cross filtering, query sharing and general usability. Basically all the things that are hard to describe, but that you know feel right when you use a tool.

Metabase: Intuitive and simple UI. You can write queries or build them with some help from the tool and add visualizations on top. The charting collection isn’t vast but contains most of the generally useful charts. You can add filters that are quite smart and based on column values. These are baked into the SQL you write yourself. A big selling point is also the ability to share links to a query you’ve been playing around with, without the need to save it as a dataset. Metabase also allows you to add click events to your charts, which we’ve found to be quite powerful. With the click events you can filter the whole dashboard or even redirect to an external address. You can bucket your queries and dashboards into collections that have their own permissions.

Image from official Metabase blog

Superset: Rich UI. Superset’s strength comes from its charting capabilities. There is a huge choice of visualizations and they are quite configurable. This does make the learning curve steeper, and at times getting something out can feel exhausting. Superset drives data exploration through clicking and selecting fields to group by, or metrics to display. This is what allows the rich charting; however, it can feel limiting when it comes to transforming your SQL into a visualization.

Image from official Superset gallery

You kind of need to know what you want to visualize before you start doing it, because in order to use ad hoc queries you first need to create a dataset and save it. You cannot easily write SQL and try out visualizations to see what looks good. There is a disconnect between the data exploration and the visualizing part. On the bright side, filtering works magically because you have your data already saved as a dataset, i.e. adding a filter in your dashboard is a click away.

Image from official Superset gallery

Redash: Simple and intuitive UI. A rich selection of charts is available as well. The experience is similar to Metabase, but where it differs is the filtering capabilities. In order to filter by the contents of the data itself, you first need to write a separate query that will serve as the source dataset. Unfortunately, it also doesn’t support having an empty filter, so you need to bake that logic in yourself when writing filters. Collections are not a thing. Sharing queries requires saving them first. Cross filtering is not intuitive.

Image from official Redash docs
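
To illustrate what baking the empty-filter logic in yourself can look like, here is a minimal sketch of one common SQL pattern: treat the empty string as "no filter selected". It uses Python's built-in `sqlite3` and a named parameter as a stand-in for a Redash `{{ product }}` query parameter. The `cargo` table and its values are invented for the example.

```python
import sqlite3

# In-memory table with made-up rows, standing in for a real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cargo (product TEXT, volume INTEGER)")
conn.executemany(
    "INSERT INTO cargo VALUES (?, ?)",
    [("crude", 100), ("lng", 40), ("lpg", 25)],
)

# An empty parameter short-circuits the condition, so every row matches.
QUERY = """
SELECT product, volume
FROM cargo
WHERE (:product = '' OR product = :product)
ORDER BY product
"""


def run(product: str):
    """Run the filtered query; pass '' to disable the filter."""
    return conn.execute(QUERY, {"product": product}).fetchall()


print(run(""))     # [('crude', 100), ('lng', 40), ('lpg', 25)]
print(run("lng"))  # [('lng', 40)]
```

The same `OR` guard has to be repeated for every optional filter, which is the extra work the tool does not do for you.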

Verdict:

Metabase > Superset > Redash

The final choice

Out of the three tools, the least mature one seems to be Redash. What drove the choice for us between Metabase and Superset was the ability to easily share queries through the tool, the intuitive UX and the rich dashboarding experience. As such, we’ve decided to give Metabase a proper go and will be trying it out in production. In future articles we’ll explore our experience with the tool.
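
As a rough sanity check, the four verdicts can also be tallied with a simple points scheme: 2 points for first place, 1 for second, 0 for third. This is not how the decision was made; it is only an illustrative sketch, and the `tally` helper and the optional weights are assumptions added for the example.

```python
from collections import Counter

# Per-category rankings taken from the verdicts in this article, best first.
VERDICTS = {
    "Deployment": ["Superset", "Metabase", "Redash"],
    "Data sources": ["Redash", "Superset", "Metabase"],
    "Alerting and reporting": ["Metabase", "Redash", "Superset"],
    "UX and dashboarding": ["Metabase", "Superset", "Redash"],
}


def tally(verdicts, weights=None):
    """Sum position-based points per tool; weights default to 1 per category."""
    weights = weights or {}
    scores = Counter()
    for category, ranking in verdicts.items():
        weight = weights.get(category, 1)
        for position, tool in enumerate(ranking):
            # First place earns len(ranking) - 1 points, last place earns 0.
            scores[tool] += weight * (len(ranking) - 1 - position)
    return scores.most_common()


print(tally(VERDICTS))
# [('Metabase', 5), ('Superset', 4), ('Redash', 3)]
print(tally(VERDICTS, {"UX and dashboarding": 2}))
# [('Metabase', 7), ('Superset', 5), ('Redash', 3)]
```

Even unweighted, the ordering agrees with the choice above, and doubling the weight of the UX category only widens the gap.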

The company

Vortexa is an energy analytics company using AI and deep industry expertise to provide the most complete view of crude oil, refined products, LPG and LNG flows globally. At Vortexa we address exciting challenges in innovative ways, day-in, day-out. Sounds like you? We’re hiring.
