Creating interactive dashboards in AWS

Carlos Cruz
Published in NicaSource
Sep 30, 2022

Decision-making is a crucial activity in every company, at every level, and in every industry. In this fast-paced and highly competitive environment, some companies rise above others precisely because of their ability to make decisions.

Traditional companies usually base their decisions on gut feelings and incomplete (or even false) information. That approach was acceptable (to a certain degree) in an era when data wasn’t as readily available as it is now.

Meanwhile, modern companies combine the experience, creativity, and knowledge they have accumulated with relevant data to create or find new business opportunities. In our current digital economy, data is what oil was to past generations.

Companies that learn to integrate data successfully into their business will have a clear advantage over the rest. In this article, we will finish our journey of building a PoC (Proof of Concept) Data Lake.

Getting started with Data Lakes in AWS

We will take the data we prepared in the previous articles and turn our data model into valuable insights with QuickSight, a tool for creating interactive dashboards.

Data Processing in AWS

As always, here’s the GitHub repository with the resources and a more beginner-friendly guide:

https://github.com/carloscruzns/datalakepoc#datalakepoc

Testing Athena

In every step of the ETL process, and even before starting, it is worth exploring and understanding the data we are working with. There are many ways to achieve this, but a very convenient one is using Athena to query our data in the different layers of our Data Lake. If you are familiar with SQL, then you know there are many possibilities. Let’s get a sample of data in our diamond layer.
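As a rough sketch, the same sampling query could also be submitted from Python with boto3 instead of the Athena console. The database name `sakila_diamond`, the table name `film_revenue`, and the S3 output location are hypothetical placeholders, not names taken from the repository.

```python
def build_sample_query(database: str, table: str, limit: int = 10) -> str:
    """Return a SELECT statement that pulls a few rows from one table."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # Athena DML uses double quotes around identifiers.
    return f'SELECT * FROM "{database}"."{table}" LIMIT {limit}'


def start_sample_query(database: str, table: str, output_location: str,
                       limit: int = 10) -> str:
    """Submit the sampling query to Athena and return its execution id.

    Needs AWS credentials and the boto3 package; output_location is an
    S3 URI where Athena writes the query results.
    """
    import boto3  # imported lazily so build_sample_query works without AWS

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=build_sample_query(database, table, limit),
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names, shown only to illustrate the generated SQL.
    print(build_sample_query("sakila_diamond", "film_revenue", limit=5))
```

Only `build_sample_query` runs offline; `start_sample_query` starts the query asynchronously, so the results still have to be fetched once the execution finishes.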

Great, we have a good data model that we can start using, but a few fields shouldn’t be there, and some others could have better names. The right way to fix this would be to update our code and rerun the ETL process.

The not-so-right but more practical way is to create a view with these easy changes and work with that going forward. I’m not encouraging technical debt, but in urgent cases changes might be required on short notice, and this approach gives us enough flexibility. After all, engineering is about managing multiple moving parts to achieve the desired result while keeping technical debt in check.
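To make the idea concrete, here is a minimal sketch of how such a view definition could be generated. The view, table, and column names are hypothetical, since the real ones depend on the data model built in the previous articles; the helper only produces the SQL text, which can then be run in the Athena query editor.

```python
import re
from typing import Mapping

# Plain identifiers only, so the generated SQL needs no quoting.
_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def _check(name: str) -> str:
    """Reject anything that is not a simple lowercase identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def build_view_ddl(view: str, source_table: str,
                   columns: Mapping[str, str]) -> str:
    """Build a CREATE OR REPLACE VIEW statement.

    `columns` maps each source column to the name the view exposes.
    Columns left out of the mapping are dropped from the view.
    """
    if not columns:
        raise ValueError("at least one column is required")
    select_list = ",\n".join(
        f"    {_check(src)} AS {_check(alias)}" for src, alias in columns.items()
    )
    return (
        f"CREATE OR REPLACE VIEW {_check(view)} AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM {_check(source_table)}"
    )


if __name__ == "__main__":
    # Hypothetical example: keep two columns, rename both, drop the rest.
    print(build_view_ddl(
        "film_revenue_v",
        "film_revenue",
        {"title": "film_title", "amount": "revenue"},
    ))
```

Because the view only selects and renames columns, the underlying tables stay untouched and the proper fix in the ETL code can still be made later.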

Setting up QuickSight

After improving our data model, we can start focusing on generating value with it. As our last stop in this journey, we will set up, design, and publish an interactive dashboard using AWS QuickSight. First, we need to sign up for the Standard edition (it offers a 30-day free trial) and select the following:

Setting up an account:

  • Authentication method: Use IAM federated identities & QuickSight-managed users
  • Region: The same as our previous resources
  • QuickSight account name: Any
  • Notification address: Any

Setting up S3:

  • Make sure the S3 buckets linked to the QuickSight account are selected
  • Select the three buckets created for our data lake layers

QuickSight access to AWS services:

  • Check IAM
  • Check S3
  • Check Amazon Athena

And with that, we are done with the initial setup for QuickSight. There are a few sample datasets in case you would like to explore what this service can do. In our case, we will use the diamond layer data we created with the ETL process. For that, we do the following:

  • Click on “New dataset”;
  • We pick Athena; we will use this service as our data source;
  • Data source name: Athena-primary, and we leave primary as the workgroup;
  • We scroll to the bottom, and there is a list of existing data sources. We click on the one we have created;
  • We select our diamond Sakila database and the view we created previously. Click on “Edit/Preview Data”.
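The console steps above could also be scripted. As a hedged sketch, the Athena data source from the list might be registered through the QuickSight API with boto3. The account id passed in is a placeholder, and deriving the `DataSourceId` from the name is an assumption made only for this example.

```python
from typing import Any, Dict


def build_athena_data_source_request(
    account_id: str,
    name: str = "Athena-primary",
    workgroup: str = "primary",
) -> Dict[str, Any]:
    """Build the keyword arguments for QuickSight's CreateDataSource call."""
    return {
        "AwsAccountId": account_id,
        # Assumption: reuse the lowercased name as the data source id.
        "DataSourceId": name.lower(),
        "Name": name,
        "Type": "ATHENA",
        "DataSourceParameters": {
            "AthenaParameters": {"WorkGroup": workgroup},
        },
    }


def create_athena_data_source(account_id: str) -> str:
    """Create the data source and return its ARN.

    Needs AWS credentials, the boto3 package, and an active QuickSight
    subscription in the account.
    """
    import boto3  # imported lazily so the builder above works without AWS

    quicksight = boto3.client("quicksight")
    response = quicksight.create_data_source(
        **build_athena_data_source_request(account_id)
    )
    return response["Arn"]


if __name__ == "__main__":
    # "123456789012" is a placeholder account id, not a real one.
    print(build_athena_data_source_request("123456789012"))
```

The console route remains the simpler option for a PoC; a scripted version mostly pays off when the setup has to be repeated across accounts or environments.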

If everything is loaded correctly, we should see data similar to what we saw in Athena. Set the dataset name (any name you consider suitable) and click on “Publish & Visualize”.

We have an empty canvas at our disposal. With this, we can start creating visualizations with our recently added dataset. They might give us valuable insights. The learning curve for QuickSight isn’t daunting at all. So, start exploring the tool and checking what ideas come to your mind.

When you are done, click on Share and export it as a dashboard. You can think of analyses as your development environment and dashboards as your final product (this is what users get access to).

This is an example of a simple dashboard that could be built using QuickSight. We can answer questions like:

  • What is the total amount of revenue?
  • How many films were offered?
  • Which films generated the most revenue?
  • Which rating generated the most revenue?
  • And more…
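Each of these questions boils down to an aggregation over the dataset, which QuickSight computes for us when we drop fields onto a visual. A minimal sketch in plain Python shows the idea; the rows are made up for illustration only, and the column names `film_title`, `rating`, and `revenue` are hypothetical rather than taken from the actual view.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Made-up rows for illustration only; not real data from the data lake.
ROWS: List[Dict[str, object]] = [
    {"film_title": "Film A", "rating": "PG", "revenue": 10.0},
    {"film_title": "Film B", "rating": "R", "revenue": 25.5},
    {"film_title": "Film A", "rating": "PG", "revenue": 4.5},
]


def total_revenue(rows) -> float:
    """What is the total amount of revenue?"""
    return sum(row["revenue"] for row in rows)


def films_offered(rows) -> int:
    """How many distinct films were offered?"""
    return len({row["film_title"] for row in rows})


def revenue_by(rows, key: str) -> List[Tuple[str, float]]:
    """Sum revenue per value of `key`, highest total first."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["revenue"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(total_revenue(ROWS))
    print(films_offered(ROWS))
    print(revenue_by(ROWS, "film_title"))
    print(revenue_by(ROWS, "rating"))
```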

Thanks!

And that’s all there is to it!

If you have followed every step and ended up with a data model and a strategy to present valuable insights, I would deem the journey a success. Of course, there is much more to consider when implementing a data lake or a data project in a production environment. We must account for data partitioning, data quality, security, big data processing, pipeline automation, data storytelling, and more (some of which were mentioned only briefly or not at all).

The objective of this series of articles was to serve as a starting point for people curious about the world of data analytics, since I find it difficult to come across a use case covered from start to finish. Perhaps you can expand on these topics in your own articles in the future.

If you decide to follow this path, I hope to see you down the road, my dear fellows!
