AWS Data Lake: Build Your Business Intelligence System.

Ohad Gazit
Jun 21, 2020 · 5 min read
AWS Quicksight BI tool


Most business intelligence projects aim to help management view summarized data from the company's sources (applications, etc.) and make better decisions.


  • Collect the data into an AWS S3 bucket using a cron job script (not AWS Kinesis Data Firehose or Kinesis Data Streams).
  • Invoke a serverless AWS Lambda function to index the metadata of the data lake.
  • Create a new database using an AWS Glue crawler.
  • Use AWS Athena to query the data, add new tables to the database, and analyze the data in the lake.
  • Use AWS QuickSight, a business intelligence (BI) application, to perform data analysis and visualization.

Collect Data and Migrate to AWS S3

In this article we discuss a database that cannot be accessed dynamically from outside, because it holds private medical data. This means we are not allowed to give external entities the DB credentials; moreover, we can't POST our data to Amazon Kinesis Data Streams or Amazon Kinesis Data Firehose to capture it into the data lake. My workaround was to build a passive, non-event-driven cron job that triggers the application servers (Django in my case) to send an anonymized JSON file to AWS S3.
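The export step described above could look roughly like the sketch below. It is not the code used in the project: the field names (`name`, `email`, `phone`, `patient_id`), the `exports` key prefix, and the choice to write one JSON object per line are all assumptions made for illustration. Note also that a salted hash is pseudonymization only; real medical data usually needs a stricter anonymization review.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical field names; the real export schema is not given in the article.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def anonymize(record: dict, salt: str) -> dict:
    """Drop direct identifiers and replace the patient id with a salted hash."""
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if "patient_id" in clean:
        raw = f"{salt}:{clean['patient_id']}".encode("utf-8")
        clean["patient_id"] = hashlib.sha256(raw).hexdigest()
    return clean


def build_key(prefix: str, when: datetime) -> str:
    """Build a date-partitioned object key such as exports/2020/06/21/export.json."""
    return f"{prefix}/{when:%Y/%m/%d}/export.json"


def export_to_s3(records: list, bucket: str, salt: str) -> str:
    """Serialize anonymized records, one JSON object per line, and upload to S3."""
    import boto3  # imported lazily so the helpers above work without AWS access

    body = "\n".join(json.dumps(anonymize(r, salt)) for r in records)
    key = build_key("exports", datetime.now(timezone.utc))
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key
```

A cron entry on the application server would then call `export_to_s3` on a schedule, for example from a Django management command. Because the server pushes the file out, no external party ever needs the database credentials.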

Indexed Metadata Data-Lake.

At this point we have a JSON file exported from the servers and we want to build a data lake. How do we do that?
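One possible shape for the indexing Lambda is sketched here, assuming the function is triggered by S3 "object created" notifications on the export bucket. The article does not say where the index is stored, so this sketch only extracts the metadata and writes it to the function's logs; `extract_metadata` and the entry field names are illustrative.

```python
from urllib.parse import unquote_plus


def extract_metadata(event: dict) -> list:
    """Turn an S3 event notification into a list of index entries."""
    entries = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        entries.append({
            "bucket": s3["bucket"]["name"],
            # Object keys arrive URL-encoded in S3 event notifications.
            "key": unquote_plus(s3["object"]["key"]),
            "size_bytes": s3["object"].get("size", 0),
            "event_time": record.get("eventTime"),
        })
    return entries


def lambda_handler(event, context):
    """Lambda entry point: collect metadata for every object in the event."""
    entries = extract_metadata(event)
    # Where the index lives is left open here; printing sends the entries
    # to the function's logs so they can be inspected.
    for entry in entries:
        print(entry)
    return {"indexed": len(entries)}
```

Keeping the parsing in a separate function makes it easy to test locally with a hand-written event before deploying.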

AWS Glue Crawler.

“AWS Glue is a fully managed ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize your data, clean it, enrich it, and move it reliably between various data stores and data streams.” from AWS docs.
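A crawler can be created in the AWS console, but the same step can also be scripted. The sketch below uses the boto3 Glue client's `create_crawler` and `start_crawler` calls; the crawler name, IAM role ARN, database name, and S3 path are placeholders that you would replace with your own values.

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build the arguments for a crawler that scans one S3 prefix."""
    return {
        "Name": name,
        "Role": role_arn,          # IAM role the crawler assumes to read the bucket
        "DatabaseName": database,  # Glue database that receives the discovered tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(request: dict) -> None:
    """Create the crawler and run it once."""
    import boto3  # imported lazily so crawler_request works without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])
```

After the crawler finishes, the tables it inferred from the JSON files appear in the named Glue database.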

AWS Athena, Query The Data Lake.

Amazon Athena is an interactive query service that makes it easy to analyze data. First, we need to connect Athena to the AWS Glue Data Catalog populated by the crawler in order to retrieve the database information; this is done in the data source tab.
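Queries can be run from the Athena console, or programmatically as in the following sketch. It assumes a Glue database already exists and that you have an S3 location for query results; the database name, table name, and output location in the usage note are placeholders.

```python
import time


def query_request(sql: str, database: str, output_location: str) -> dict:
    """Build the arguments for an Athena query against a Glue database."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes result files to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: dict, poll_seconds: float = 1.0) -> str:
    """Start the query and poll until it reaches a final state."""
    import boto3  # imported lazily so query_request works without AWS access

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(poll_seconds)
```

For example, `run_query(query_request("SELECT COUNT(*) FROM exports", "lake_db", "s3://lake-query-results/"))` would count the rows of a hypothetical `exports` table.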

Migrate to QuickSight, the AWS BI Tool.

“Amazon QuickSight is a fast, cloud-powered business intelligence service that makes it easy to deliver insights to everyone in your organization.”

  1. Pick a table you wish to view graphically:


We just saw how to create a data lake and took the first steps in analyzing the data for BI purposes. All these tools are powerful and form the basis for BI, AI and ML. A few years ago, who would have guessed that this could be done without a single physical computer or software license?

  • AWS permissions
  • Schedules / Dynamic data triggers.
  • Medical data anonymization.
