How we deal with a lot of data using N.R.T. as B.I.

Rafael Miceli
BTG Pactual Developers
4 min read · Jun 14, 2020

In my area, some of the people who use what we deliver are used to working with data, mostly in Power BI. Because of that, one of the first challenges we faced was choosing a strategy to store our data and make it easy to create visualizations for those users.

The Problem with a few B.I. tools

Tools such as Power BI have a few disadvantages. One is that they need to poll the data source every time they update a dashboard. This makes our users wait minutes for a refresh and, depending on the database you are using, can even incur extra charges. Consolidating data from different solutions is also harder.

Because of these drawbacks, we decided to avoid relying on Power BI only.

N.R.T.s to help out

N.R.T. analytics tools (NRT stands for Near Real-Time) let you run queries and get results in near real time, as the name implies. With these tools we generally push data into them, so our users see their dashboards updated much faster.
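As a rough sketch of what "pushing" can look like, assume the target is Elasticsearch and the events are sent through its bulk endpoint. The bulk format is newline-delimited JSON: one action line followed by one document line per event. The index name, field names, and the `build_bulk_body` helper below are hypothetical and only illustrate the idea; they are not taken from our setup. Actually sending the body (a POST to the cluster's `_bulk` endpoint with the `application/x-ndjson` content type) is left out on purpose.

```python
import json
from typing import Iterable, Mapping


def build_bulk_body(index: str, docs: Iterable[Mapping[str, object]]) -> str:
    """Build a newline-delimited JSON body in the Elasticsearch bulk format.

    Hypothetical helper for illustration only.
    """
    lines = []
    for doc in docs:
        # Action line: tells Elasticsearch to index the next line into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    if not lines:
        return ""
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical events produced by an application as they happen.
events = [
    {"metric": "orders_processed", "value": 12, "timestamp": "2020-06-14T10:00:00Z"},
    {"metric": "orders_processed", "value": 15, "timestamp": "2020-06-14T10:01:00Z"},
]

body = build_bulk_body("deliveries-demo", events)
print(body)
```

The point of the sketch is the direction of the data flow: the application sends each batch as soon as it exists, instead of a dashboard asking the database for everything again on every refresh.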

Before adopting any NRT analytics tool we needed to evaluate our options. When talking about NRT analytics, there are two major competitors: Splunk and the ELK stack. Each has advantages for our scenario.

The advantages we saw in the ELK stack were:

  • Open Source
  • Easy to create basic Visualizations and Dashboards
  • Easy to create queries
  • Support for Vega/Vega-Lite

And the advantages we saw in Splunk were:

  • Far more user management options
  • Support included (as Splunk is paid)
  • A lot of plugins
  • Easy to import Data

Explaining ELK Advantages

To expand on the advantages above: Elasticsearch has a vibrant community and adds new functionality quickly. One feature that really shines is the newly added support for Vega/Vega-Lite, a declarative format for creating, saving, and sharing visualization designs using JSON. Creating visualizations with KQL, which resembles SQL a lot, is also really easy. Another plus is how simple it is to deploy an ELK cluster with Docker for testing or mocking, or even to back a simple search tool (we have done this a lot, with significant benefits).
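To give a feel for what "declarative" means here, the sketch below builds a minimal Vega-Lite bar chart specification as a Python dictionary and serializes it to JSON. The `bar_chart_spec` helper, the field names, and the sample rows are hypothetical. The data is inlined to keep the example self-contained; in Kibana it would typically come from an Elasticsearch query instead.

```python
import json
from typing import Iterable, Mapping


def bar_chart_spec(
    rows: Iterable[Mapping[str, object]],
    category_field: str,
    value_field: str,
) -> dict:
    """Return a minimal Vega-Lite bar chart spec (hypothetical helper)."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        # Inline data keeps the sketch self-contained.
        "data": {"values": list(rows)},
        "mark": "bar",
        "encoding": {
            # One bar per category on the x axis.
            "x": {"field": category_field, "type": "nominal"},
            # Bar height is taken from the numeric field.
            "y": {"field": value_field, "type": "quantitative"},
        },
    }


# Hypothetical sample rows.
rows = [
    {"team": "A", "requests": 120},
    {"team": "B", "requests": 75},
    {"team": "C", "requests": 98},
]

spec = bar_chart_spec(rows, "team", "requests")
print(json.dumps(spec, indent=2))
```

The whole chart is just data: there is no drawing code, only a description of the mark and of how fields map to axes, which is what makes these designs easy to save and share.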

Challenges with ELK

But ELK has its challenges. If you are not going to pay for a managed platform like Logz.io, you need a strong infrastructure team to maintain your cluster. Inserting data into ELK correctly can also be harder than you think. And don’t forget that ELK comes with no support (again, unless you are paying for a platform).

Explaining Splunk Advantages

Splunk is only available with a paid license. The advantages are that we can customize our users’ profiles so they see only what they are allowed to see, and that Splunk has a lot of plugins (we use AlertManager, for example). The Splunk web interface is also very complete, with a lot of features, and, as said above, importing data such as an Excel file is really easy.

Challenges with Splunk

But like ELK, Splunk has its struggles. We noticed that creating queries in Splunk is harder for our users, as SPL is not as friendly as KQL.

In the end

We ended up using Splunk because of the benefits listed above and, most importantly, because we had a team member with Splunk knowledge who spent a lot of time helping our users and teaching them about the tool. As a result, our users adopted it. This was crucial to our success, and it is something I recommend regardless of the NRT tool you choose.

Because if you have only the output but not the outcome, you end up with a tool that no one uses, and the effort was in vain.
