Visualization Libraries in the Age of Advanced Self-Service BI Tools

When it still makes sense to rely directly on visualization libraries rather than using data visualization tools

Patrick Pichler
Creative Data
4 min read · Jul 22, 2020


Photo by Lukas Blazek on Unsplash

Introduction

There are basically two ways of visualizing data: ready-to-use visualization tools on the one hand, and purpose-built programming libraries for data visualization on the other. Both approaches have been continuously developed over time, either by their respective software vendors or by open-source communities. It should be mentioned that most visualization tools actually rely on visualization libraries under the hood to draw their charts and diagrams. Still, the wide range of visualization tools on today’s market, including recent enhancements towards integrated advanced analytics features, raises the question: why reinvent the wheel and struggle with coding if you could achieve the same, similar, or even better results with just a few clicks?

Advanced Self-service BI Tools

To answer that question, let’s start with the capabilities of advanced self-service BI tools such as Power BI, Tableau or QlikView, just to name a few. They all provide a fantastic drag-and-drop experience for data visualization. Most of them let you easily integrate multiple data sources through a wealth of built-in connectivity options, followed by modeling, manipulating, and shaping the data to your needs. As already mentioned, recent releases even bring built-in Machine Learning (ML) and Artificial Intelligence (AI) capabilities into the tools; the lack of such features used to be one of the main reasons to rely directly on Python or R libraries. On top of that, some of these libraries can meanwhile be used from within most tools, allowing you to code without leaving them.
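For example, a Python script visual in Power BI receives the selected fields as a pandas DataFrame called `dataset` and renders whatever figure the script produces. A rough, purely illustrative sketch of such an in-tool script (the column names are made up) might look like this:

```python
# Sketch of a Python script visual as it could be written inside a BI tool such
# as Power BI, which injects the selected fields as a pandas DataFrame named
# `dataset`. The column names below are illustrative only.
import matplotlib.pyplot as plt

summary = dataset.groupby("category", as_index=False)["revenue"].sum()

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(summary["category"], summary["revenue"])
ax.set_title("Revenue by category")
ax.set_ylabel("Revenue")
plt.show()  # the tool captures the rendered figure and places it on the report page
```

So why rely solely on visualization libraries and code outside such tools at all?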

Visualization Libraries

When it comes to visualization libraries, JavaScript (e.g. D3), Python (e.g. Plotly) and R (e.g. ggplot2) probably have the most to offer. The latter two are the state of the art anyway among programming languages oriented towards advanced data analytics. So, let’s look at when it indeed still makes sense to use or learn one of them.

No limits

The first and probably most important aspect of working with visualization libraries is that you are not bound by the limitations of a proprietary framework. Thanks to these great libraries, you can integrate and visualize virtually anything with just a few lines of code. Visualization tools, on the other hand, often come with restrictions regarding supported libraries and/or the amount of data. You also get full control over your data as well as the look and feel of your application or report. For instance, here is an interactive Python-based Dash app that is styled to look like a PDF report:

Dash App Sample
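To give a flavor of this control, here is a minimal Plotly sketch (with made-up numbers) in which template, fonts, colors, and margins are all defined explicitly rather than dictated by a tool:

```python
import plotly.graph_objects as go

# Made-up numbers, purely for illustration
fig = go.Figure(go.Bar(x=["Q1", "Q2", "Q3", "Q4"], y=[120, 95, 140, 180]))

# Every visual detail is yours to define: template, fonts, colors, margins, ...
fig.update_layout(
    template="simple_white",
    title="Quarterly revenue (sample data)",
    font=dict(family="Georgia, serif", size=14, color="#333333"),
    margin=dict(l=60, r=60, t=80, b=60),
)

fig.write_html("report.html")  # or fig.show() in a notebook
```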

Self-made dashboard apps

Visualization libraries and their open-source frameworks even allow you to build your own interactive dashboard web app, for example with Dash or Bokeh for Python and Shiny for R. Visualization tools are really great and flexible for analyzing existing datasets, where you can slice and dice data or drill down/up/across, but they are actually weak when it comes to calculations based on real user inputs.

Dash App Sample
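To illustrate the point about user inputs, here is a minimal, purely hypothetical Dash sketch in which a calculation is driven by values the user types in rather than by a pre-built dataset:

```python
from dash import Dash, dcc, html, Input, Output

app = Dash(__name__)

app.layout = html.Div([
    html.H3("Interest estimator (illustrative only)"),
    dcc.Input(id="amount", type="number", value=10000),
    dcc.Input(id="rate", type="number", value=3.5),
    html.Div(id="result"),
])

@app.callback(Output("result", "children"),
              Input("amount", "value"), Input("rate", "value"))
def estimate(amount, rate):
    # Calculation based on real user input rather than an existing dataset
    if amount is None or rate is None:
        return "Please enter an amount and a rate."
    yearly_interest = amount * rate / 100
    return f"Estimated yearly interest: {yearly_interest:,.2f}"

if __name__ == "__main__":
    app.run(debug=True)
```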

Embedding dashboards

This leads me to the next argument in favor of relying on visualization libraries. Even if you don’t want to build your own dashboard app, you can make visualizations a genuine part of a web application or site, or at least make them appear as one by sharing a unified look and feel. Of course, you could argue that dashboards embedded from third-party visualization tools natively support dynamic filtering, cross-highlighting and security, all of which take time to implement yourself. In most cases, however, you might just want to present information. Apart from that, embedding licenses for third-party products can get quite expensive, and their utilization is hard to estimate up front.
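As a simple sketch of this approach, a library-generated chart can be exported as an HTML fragment and dropped into an existing page, inheriting that page’s styling (the data and file name here are illustrative):

```python
import plotly.express as px
import plotly.io as pio

# Made-up data, purely for illustration
fig = px.line(x=[2017, 2018, 2019, 2020], y=[4.1, 4.8, 5.6, 6.3],
              labels={"x": "Year", "y": "Users (millions)"})

# full_html=False returns just a <div> snippet that can be embedded in any page template
snippet = pio.to_html(fig, full_html=False, include_plotlyjs="cdn")

with open("chart_snippet.html", "w", encoding="utf-8") as f:
    f.write(snippet)
```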

Data exploration

Last but not least, interactive data exploration has to be mentioned as another typical application of visualization libraries. These days this is mostly done in a notebook format such as Jupyter. Notebook environments allow you to combine live code execution with textual commentary and data visualization. The resulting content then constitutes a kind of iterative conversation between the researcher and the data, which makes conclusions easier to follow and understand.

Plotly Jupyter Sample
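A typical exploration cell in such a notebook might look like the following sketch, using plotly express on one of its built-in sample datasets:

```python
import plotly.express as px

# Built-in sample dataset shipped with plotly, handy for quick exploration
df = px.data.gapminder().query("year == 2007")

# Each reshaping or re-plotting step becomes one cell in the iterative "conversation"
fig = px.scatter(df, x="gdpPercap", y="lifeExp", size="pop", color="continent",
                 hover_name="country", log_x=True)
fig.show()
```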

Conclusion

Long story short, if you want to focus solely on presenting data, making it interactive, and including some basic AI or ML features, then advanced self-service BI tools are the way to go. There are also amazing open-source tools available for just visualizing data, such as Apache Superset, Metabase or Redash; the latter has recently been acquired by Databricks and is now part of their notebook environment. Otherwise, if you are involved in data science projects or want to build something more bespoke with your data, you will likely find yourself coding with visualization libraries.
