How to visualize your data at no cost?

Vitor Prado
Published in Sinch Blog
4 min read · Oct 15, 2020
Democratizing your Data Lake

There is a wide variety of data visualization tools on the market today, each with its own differentiators and specific characteristics. Among so many options, we at Wavy Global chose Metabase.

Before sharing the reasons why we chose Metabase, it makes sense to start with a short introduction to the tool.

What is Metabase?

“THE SIMPLEST, FASTEST WAY TO GET BUSINESS INTELLIGENCE AND ANALYTICS TO EVERYONE IN YOUR COMPANY”- METABASE

For us, the main highlights of Metabase in relation to other visualization tools are:

  • Free and open source (GitHub)
  • Does not require knowledge of SQL

Because it is an open source tool, it is much easier to adopt within the company, without depending on budget approval from the finance team. In addition, the tool is developed in the open, with contributions from the community, so the more people use it, the more complete the product becomes.

How does it work?

To start using it, just follow these steps:

  • Set it up in 5 minutes
  • Add connections to your databases
  • Create the desired questions (queries)
  • Create a dashboard and add your questions
  • Create / maintain data governance (access control and permissions)

Okay, after that you will have your first dashboard to visualize your data.

Fast, isn’t it? :)
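The steps above can also be scripted against Metabase's REST API. The sketch below only builds the requests and does not send them, so it runs without a Metabase instance. The base URL, credentials, and connection details are placeholders, and the payload fields reflect my understanding of the API, so check them against the API documentation for your Metabase version before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # assumption: a local Metabase instance


def build_request(path, payload, session_token=None):
    """Build (but do not send) a JSON POST request for a Metabase API path."""
    headers = {"Content-Type": "application/json"}
    if session_token:
        # After login, the session id is passed in this header.
        headers["X-Metabase-Session"] = session_token
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method="POST"
    )


# Log in to obtain a session token (placeholder credentials).
login = build_request(
    "/api/session",
    {"username": "admin@example.com", "password": "change-me"},
)

# Add a PostgreSQL connection (every detail here is a placeholder).
add_db = build_request(
    "/api/database",
    {
        "engine": "postgres",
        "name": "warehouse",
        "details": {
            "host": "db.internal",
            "port": 5432,
            "dbname": "analytics",
            "user": "metabase_ro",
            "password": "change-me",
        },
    },
    session_token="token-from-login",
)

# Create an empty dashboard to which questions can be added later.
new_dashboard = build_request(
    "/api/dashboard",
    {"name": "My first dashboard"},
    session_token="token-from-login",
)

for req in (login, add_db, new_dashboard):
    print(req.get_method(), req.full_url)
```

To actually send one of these requests, pass it to `urllib.request.urlopen` and read the session id from the login response before building the authenticated calls.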

Dashboards (like this one) are easy to build, share, and explore.

How do we use it?

We offer it as a simpler, more intuitive way to query the Data Lake and Data Warehouse, giving teams more autonomy.

Today Metabase helps a lot with fast, dynamic deliveries, whether a simple analysis or a full dashboard. The goal is a self-service data environment where everyone can run queries, even without knowing SQL.

Why do we use it?

Access control -> Governance and GDPR/LGPD

Data governance has always been important, but with the LGPD (Brazil's General Data Protection Law) and the GDPR (General Data Protection Regulation) in force, it has become even more critical. Even on the free plan, Metabase provides granular and fairly complete access control.

Connection to multiple datastores -> BigQuery and PostgreSQL

Metabase currently connects to more than 10 different databases, both SQL and NoSQL. In our case, the connections to BigQuery and PostgreSQL were the important ones.

Pulses / MetaBot -> Everyone wants to receive alerts

Metabase already gives teams more autonomy and, in a way, “democratizes the data”. The ability to create automated alerts makes it even better. Pulses and MetaBot are the integrations with email and Slack that send out the questions created in Metabase.

Data catalog -> Integrated in the tool

What stands out here is that, even on the free plan, Metabase has a good built-in data catalog, which makes data exploration easier and avoids the need for a separate data catalog tool.

Open source -> No license concerns

Besides having no license cost, we do not need to worry about the number of licenses or the number of users with access to the tool, which simplifies data sharing.

Creating questions interactively -> Frequently asked questions about the business

Metabase makes it possible to create questions interactively: in a few clicks, following the guided steps, you can query a database without writing a single line of SQL.

Creating questions interactively
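To make the idea concrete, here is a small sketch of the kind of SQL that a few clicks can stand in for. The `messages` table and its columns are hypothetical, SQLite is used only so the example runs anywhere, and the SQL that Metabase actually generates for a given database may look different.

```python
import sqlite3

# Hypothetical table, created in memory just for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, channel TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO messages (channel, status) VALUES (?, ?)",
    [
        ("sms", "delivered"),
        ("sms", "failed"),
        ("whatsapp", "delivered"),
        ("sms", "delivered"),
    ],
)

# Clicks in the interactive editor:
#   pick the "messages" table -> filter status = delivered
#   -> summarize by count -> group by channel
# Roughly equivalent hand-written SQL:
equivalent_sql = """
    SELECT channel, COUNT(*) AS total
    FROM messages
    WHERE status = 'delivered'
    GROUP BY channel
    ORDER BY channel
"""

rows = conn.execute(equivalent_sql).fetchall()
print(rows)  # [('sms', 2), ('whatsapp', 1)]
```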

Lessons learned from Metabase

What lessons did we learn after using the tool in production for more than a year?

  • Not having to worry about the number of licenses is fine, but it can get out of hand. We noticed that several users had not accessed the tool for a long time.
  • Giving more people access to the data creates more autonomy, but it can also create discrepancies. It is risky to present two analyses of the same data with different results, because this can erode confidence and, consequently, the credibility of the data team.
  • It is important to clean up Metabase regularly. After a few months, we had hundreds of questions, and most of them were no longer used or belonged to projects that had died.
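A periodic cleanup like the one described in these lessons can be sketched as a simple staleness check. The records below are made up, the field names (`last_login`, `last_viewed`) are hypothetical, and the 90-day threshold is an arbitrary assumption. In practice the data would come from an export of your users and questions.

```python
from datetime import date, timedelta


def stale_items(items, last_used_key, today, max_idle_days=90):
    """Return items never used, or last used more than max_idle_days ago."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        item
        for item in items
        if item[last_used_key] is None or item[last_used_key] < cutoff
    ]


# Made-up sample data for illustration only.
users = [
    {"email": "ana@example.com", "last_login": date(2020, 10, 1)},
    {"email": "bob@example.com", "last_login": date(2020, 3, 2)},
    {"email": "new@example.com", "last_login": None},
]
questions = [
    {"name": "Daily volume", "last_viewed": date(2020, 10, 14)},
    {"name": "Old campaign", "last_viewed": date(2019, 12, 1)},
]

today = date(2020, 10, 15)
inactive_users = stale_items(users, "last_login", today)
unused_questions = stale_items(questions, "last_viewed", today)

print([u["email"] for u in inactive_users])
print([q["name"] for q in unused_questions])
```

Reviewing the two lists with the owning teams before deactivating or archiving anything keeps the cleanup from removing something that is still needed occasionally.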

As a suggestion, I leave the recording of my talk at SEMCOMP 2020, “How to visualize your data with open source tools”.

#data #dataengineering #metabase #wavyglobal
