Solving the hurdles of embedding analytical dashboards into a transactional web application

Abubakkar Siddiq MOHAMED FAROOK
Published in DBS Tech Blog · 10 min read · Nov 17, 2021

GENE is the online transaction processing (OLTP) application of Genesis, a comprehensive programme that streamlines both platform delivery and financial management to seamlessly and predictably enable DBS to unlock the full potential and value of Agile.

I am Abubakkar Siddiq, a Solution Architect for the MOT (Middle Office Technology) tech unit within DBS Bank Ltd. My job is to help solve challenging business problems with innovative technology solutions, aligning business needs to IT capabilities.

In this article, I will be going through the roller-coaster journey that we took when embedding OLAP-based BI (Business Intelligence) dashboards within our GENE OLTP-based web application. I will also be discussing some of the challenges we faced during this journey and how we overcame them.

Our Evaluation and Selection of the OLAP Dashboard Development Tool

To give our business users better visualisation of the data, we were required to build more than 30 dashboards within the GENE application, using diverse types of visual tools like charts, plots, graphs, maps, and data tables. As GENE is an OLTP application, we had already built a few dashboards with Angular on the front end and Java-based Spring Boot on the back end.

While developing these dashboards in Angular, we faced many challenges with the regular OLTP-based development approach, including but not limited to:

1. Exceptionally long SDLC (System Development Life Cycle) for the development and testing of the dashboards.

2. Lack of agility in refining visualisations due to the long time and effort required to build and test any changes in the presentation.

With these restrictions, the effort necessary to build a single dashboard using traditional OLTP techniques was roughly two developers working for two weeks. This was not only inefficient, but also imposed further challenges on our team:

1. Our long-term strategy was to enable business users to build their own dashboard visualisations and respond to insights more quickly using self-service BI (Business Intelligence), which could not be achieved with the OLTP-based approach.

2. Developing more than 30 dashboards within three months using the OLTP-based approach was an unrealistic timeline based on the above estimates.

To overcome the limitations and challenges of the existing OLTP-based approach to dashboard development, we analysed all the OLAP-based BI tools used within the organisation and finally chose the Tableau BI tool to help us fast-track our development efforts. It already had a large footprint within our organisation and met most of the requirements of our programme, including solutions for the two challenges faced by the GENE team.

Our Journey and Challenges

After our decision to use OLAP dashboards, we started our journey of integrating these dashboards within the Web UI (User Interface) to offer a harmonious user experience. Though OLAP dashboards were commonly used within the organisation, no other team had ever tried to embed them within an OLTP web application. The development and integration of OLAP-based dashboards into an OLTP web application posed its own risks and challenges, as highlighted below.

1. Integrating OLAP-based dashboards within the Web UI using Single Sign-On (SSO)

The foremost challenge with OLAP-based dashboard development was integrating these dashboards into the Web UI. Since a double login process (first into our web app and then again on the OLAP dashboard login screen) would not make for a great user experience, we wanted to achieve Single Sign-On (SSO) between the OLTP web application and the OLAP dashboards. The OLAP dashboard server supported the following options for SSO:

a. Security Assertion Markup Language (SAML)

b. Kerberos

c. OpenID

d. Trusted Authentication

e. Mutual SSL (Secure Sockets Layer)

With each method having its own challenges and complications, we sought the most suitable and secure option and decided on trusted authentication.
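To illustrate the flow, below is a minimal TypeScript sketch of a trusted-authentication handshake, assuming a Tableau-style trusted-ticket endpoint. The host, workbook and view names are placeholders, and in our stack the ticket request would be issued from the back end, which is registered as a trusted host.

// Sketch of the trusted-authentication handshake (host and view names are
// illustrative placeholders, not the actual GENE or server configuration).
async function getEmbedUrl(username: string): Promise<string> {
  const dashboardServer = "https://olap-dashboard-server.example.com";

  // Step 1: the web application (a trusted host) requests a one-time ticket
  // on behalf of the already-authenticated user.
  const response = await fetch(`${dashboardServer}/trusted`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ username }),
  });
  const ticket = (await response.text()).trim();

  // A ticket value of "-1" means the server refused to issue a ticket,
  // e.g. because the user is unknown or the calling host is not trusted.
  if (ticket === "-1") {
    throw new Error(`Trusted ticket was not issued for user ${username}`);
  }

  // Step 2: the ticket is redeemed exactly once when the dashboard URL is
  // loaded inside the web UI, so the user never sees a second login screen.
  return `${dashboardServer}/trusted/${ticket}/views/workbook1/dashboard1`;
}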

2. Data source filters

We implemented 13 diverse types of Quick Filters with different levels of hierarchy: global, secondary, and tertiary filters.

Figure 1

These quick filters are used to slice and dice the data presented in each and every dashboard. While global filters are common across all of the dashboards, secondary and tertiary filters are only available on some of them, with each tertiary filter being very specific to its dashboard.

The OLAP dashboards read a portion of the 500,000 records available in the DB (database) table and combine them with data from other related tables before presenting the output as insights through the dashboards. These dashboards, however, only required up to a maximum of 10,000 records from the DB table to do this (Figure 1).

The issue was that with all the filters within the dashboard kept as “Quick Filters” (refer to Figure 2), the Structured Query Language (SQL) generated by the OLAP dashboard server was not able to take into account the filtering conditions originally defined on the DB source table. Therefore, the query generated for the Quick Filters read and scanned through all 500,000 records in order to render each and every quick filter, rather than just the 10,000 that were needed. As a basic rule of database querying, reading more data than required from a table degrades query performance.

To overcome this, we overrode the default behaviour of the OLAP dashboard server by defining all 13 filter conditions on the data source itself, allowing the generated query to read only the required 10,000 records. With this, the query was more performant and the dashboards loaded faster than earlier versions.

Figure 2

3. Externalising Quick Filters

Despite applying filters on the data source table and improving the performance of the queries generated by the dashboards, we still did not have complete control over the queries produced by the dashboard Quick Filters. Therefore, we decided to move the global and secondary filters completely outside the OLAP dashboards and place them in our web app UI instead (Figure 3), while keeping tertiary filters within their respective dashboards.

Figure 3

This approach inherently presented two new challenges:

1. Firstly, after moving all global and secondary filters into dependent dropdowns within our Web UI, the options presented in each dropdown depended on the values selected in the previous dropdowns.

We discovered that the JavaScript Application Programming Interface (API) client library that we used to build and model the dropdown behaviour applied the filtering conditions in a very inefficient way: it applied each filter one by one, and as each filter change took nearly 5 seconds to apply, refreshing all 13 filters took more than a minute.
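As an illustration of the pattern we moved away from, the sketch below applies the externalised filter selections sequentially through the dashboard's JavaScript API. It assumes a Tableau-JavaScript-API-style applyFilterAsync call and placeholder field names; each awaited call forces its own round trip before the next filter is applied, which is why the refresh was so slow.

// Sequential filtering via the dashboard's JavaScript API (illustrative; the
// applyFilterAsync shape follows the Tableau JavaScript API, field names are placeholders).
interface FilterableSheet {
  applyFilterAsync(field: string, values: string[], updateType: string): Promise<string>;
}

async function applyFiltersSequentially(
  sheet: FilterableSheet,
  selections: Record<string, string[]>
): Promise<void> {
  for (const [field, values] of Object.entries(selections)) {
    // Each call triggers its own query and re-render on the dashboard server,
    // so 13 filters at roughly 5 seconds each add up to over a minute.
    await sheet.applyFilterAsync(field, values, "replace");
  }
}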

To work around this limitation, we changed our approach to append all the filter conditions to the Uniform Resource Locator (URL) as parameters. The URL parameters on the dashboard link allowed us to apply the filters for all 13 dropdowns directly against the database table columns:

https://host:port/dashboard1?col1=val1&col2=val2,val3

However, for this to work, the URL parameter names (col1 and col2) needed to match the column names in the data source table.
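A minimal sketch of how such a URL could be assembled from the externalised dropdown selections is shown below; the column names and values are the same hypothetical placeholders used in the URL above.

// Build the dashboard link with filter conditions as URL parameters.
// Column names must match the data source table columns exactly.
interface FilterSelection {
  column: string;    // e.g. "col1" — must match a column in the data source table
  values: string[];  // selected dropdown values, e.g. ["val2", "val3"]
}

function buildDashboardUrl(base: string, filters: FilterSelection[]): string {
  const query = filters
    .filter((f) => f.values.length > 0)
    // Multiple values for a single column are passed as a comma-separated list.
    .map((f) => `${f.column}=${f.values.map(encodeURIComponent).join(",")}`)
    .join("&");
  return query ? `${base}?${query}` : base;
}

// buildDashboardUrl("https://host:port/dashboard1",
//   [{ column: "col1", values: ["val1"] }, { column: "col2", values: ["val2", "val3"] }])
// => "https://host:port/dashboard1?col1=val1&col2=val2,val3"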

2. Secondly, with the dependent dropdowns, we needed to make an API call to the back end, which in turn queried the database table to fetch the values for a single dropdown.

Even after applying all the conditions from the earlier dropdowns and only considering 20,000 records, the query tasked with finding the distinct dropdown options was still heavyweight:

select distinct coln5 from table1
where col1 = 'val1' and col2 = 'val2' and col3 = 'val3' and col4 in ('val4', 'val5');
-- still heavyweight due to the 10,000 records and 150+ columns in our table

To circumvent this, we materialised the data required for the dropdown into a separate table and queried that table to construct the filter dropdowns instead of querying the original source table containing 10,000 records and over a hundred columns.

create table dropdown_materialized_table as
select distinct col1, col2, col3, col4, coln5 from table1 where <conditions>;
-- this materialised table contains only around 500 distinct combinations of records from the
-- original table, allowing the query for the distinct dropdown values to run much faster

select distinct coln5 from dropdown_materialized_table
where col1 = 'val1' and col2 = 'val2' and col3 = 'val3' and col4 in ('val4', 'val5');
-- with only 500 records instead of 10,000, this query performed about 1,000 times better than the earlier one
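For completeness, here is a sketch of how the front end could request the options for one dependent dropdown from the back end, which then runs the distinct query above against the materialised table. The endpoint path and parameter handling are hypothetical illustrations rather than the actual GENE API.

// Fetch the options for one dependent dropdown from the back end, passing the
// selections already made in the earlier dropdowns (placeholder endpoint).
async function fetchDropdownOptions(
  column: string,
  selectedSoFar: Record<string, string[]>
): Promise<string[]> {
  const params = new URLSearchParams({ column });
  for (const [col, values] of Object.entries(selectedSoFar)) {
    params.set(col, values.join(","));
  }
  // The back end translates this into a "select distinct <column> from
  // dropdown_materialized_table where ..." query against the small table.
  const response = await fetch(`/api/dashboard-filters/options?${params.toString()}`);
  return response.json();
}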

4. Near real-time dashboards

After taking all global and secondary filters out of the dashboards, the dashboards ran much faster than their preliminary versions. The average loading time across all our dashboards improved from more than 4 minutes down to 40 seconds when loading data directly from the database, even on a cache miss within the OLAP dashboard server. With this, users were now able to apply global and secondary filters multiple times to visualise different slices of the data within the dashboards.

However, the 40-second response time for loading dashboards was still not acceptable.

To further our performance optimisation journey, we decided to investigate all the table schemas and the SQL being generated to fetch the data for our dashboards. All our dashboards had a similar data source schema and relationships: a primary table holding aggregated data, joined with a set of lookup and secondary tables (refer to Figure 1).

To check all the SQL being generated to render the dashboards, we started to record the performance of our OLAP-based dashboards and found that when the primary, secondary and lookup tables were combined using join relationships in the OLAP dashboard, the resulting query degenerated into a Cartesian product due to missing join conditions between these tables. This led to far more rows being read from the database than the minimum required.

It became quite challenging to handle the optimisation at this point as we had:

1. No control over the SQL being generated by the OLAP dashboard server

2. No way to supply the missing join conditions that would have eliminated the Cartesian product

We then explored the possibility of enabling a scheduled data refresh in the OLAP dashboard server to avoid reading the data directly from the database. However, a scheduled refresh is costly in terms of OLAP dashboard server resources, and we were restricted to refreshing no more than once every 8 hours, which was not acceptable.

Our requirements called for near real-time data in the OLAP dashboards, so we created a de-normalised table structure combining all the primary, lookup and secondary tables, as shown below:

create or replace table denor_table (
    tab1_col1 varchar(10), tab1_col2 varchar(10), tab1_col3 decimal(10, 9), tab1_col4 decimal(10, 9),
    tab2_col1 varchar(10), tab2_col2 varchar(10), tab2_col3 decimal(10, 9), tab3_col1 varchar(10)
);

insert into denor_table select col1 as tab1_col1, col2 as tab1_col2, col3 as tab1_col3, col4 as tab1_col4, null as tab2_col1, null as tab2_col2, null as tab2_col3, null as tab3_col1 from table1;

insert into denor_table select col5 as tab1_col1, col6 as tab1_col2, null as tab1_col3, null as tab1_col4, col7 as tab2_col1, col8 as tab2_col2, col9 as tab2_col3, null as tab3_col1 from table2;

insert into denor_table select col10 as tab1_col1, col11 as tab1_col2, null as tab1_col3, null as tab1_col4, null as tab2_col1, null as tab2_col2, null as tab2_col3, col12 as tab3_col1 from table3;

-- example dashboard query against the de-normalised table
select tab1_col1, tab1_col2, sum(tab1_col3) from denor_table group by tab1_col1, tab1_col2;

The de-normalised (DN) table introduces the following changes to the way we build dashboards:

1. It contains only the 10,000 records needed for the dashboards, rather than the 500,000 records in our original primary table

2. It does not need to join with any other table when rendering its source data; instead, we join all the data upfront and pre-populate it into the table

3. The OLAP dashboards point directly to this DN table, instead of pointing to the individual tables and joining them using relationships on the OLAP dashboard server's end

4. The DN table is refreshed every half hour, achieving near real-time data without the 8-hour delay (one possible way to schedule such a refresh is sketched at the end of this section)

With this optimisation change, we achieved a load time of 5 to 8 seconds even when the data was being fetched directly from the database during cache misses within the OLAP dashboard server.
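This article does not describe the exact mechanism behind the half-hourly refresh, so the sketch below is purely illustrative: a simple scheduler that rebuilds denor_table from the source tables, with executeSql standing in for whatever database access layer the back end uses.

// Illustrative half-hourly refresh of the de-normalised table; executeSql is a
// placeholder for the real data access layer, and the insert statements are the
// ones shown earlier (abbreviated here).
type SqlExecutor = (sql: string) => Promise<void>;

async function refreshDenormalisedTable(executeSql: SqlExecutor): Promise<void> {
  await executeSql("truncate table denor_table");
  await executeSql("insert into denor_table select /* columns from table1 */ ... from table1");
  await executeSql("insert into denor_table select /* columns from table2 */ ... from table2");
  await executeSql("insert into denor_table select /* columns from table3 */ ... from table3");
}

function scheduleRefresh(executeSql: SqlExecutor): void {
  // Run every 30 minutes so the dashboards stay near real-time.
  setInterval(() => {
    refreshDenormalisedTable(executeSql).catch((err) =>
      console.error("De-normalised table refresh failed", err)
    );
  }, 30 * 60 * 1000);
}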

Our Biggest Challenge

Throughout our journey, the biggest challenge we faced was tuning the performance of the dashboards to meet our users' expectations: we had to ensure that the information they required was neither too slow to be delivered nor too difficult to access. Despite the difficulties posed by the 3-month timeline and our goal of delivering over 30 dashboards, we persevered, and the final development and integration of the OLAP dashboards was our light at the end of the tunnel.

Our roller-coaster journey finally ended on a high note, with the delivery of insightful and highly performant dashboards and very satisfied customers within the community who use these dashboards as part of their day-to-day work.
