Building a Data Pipeline to Analyze Clickstream Data with AWS

Farzanajuthi
2 min read · Nov 16, 2023

Cafe Website

While working on this project (Cloud Data Pipeline Builder [46997]), I found it very interesting, but I also ran into many issues along the way. So I decided to document it so that it can be useful to anyone who does this lab in the future.

I found this lab in AWS cloud academy. Because it is a big project, I have divided it into multiple steps so that anyone can focus on a specific area.

In this project, there is a cafe website where visitors can view the menu of products and purchase from it.

The owner of this website wants to see:

  1. Number of visitors who access the menu
  2. Number of visitors who make a purchase
  3. Number of visitors who access the menu but do not purchase
  4. Which regions visitors come from the most
  5. Which regions order the most
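
To make these five questions concrete, here is a minimal Python sketch of how they could be answered once the web log entries have been parsed. It is not the lab's method (the lab uses CloudWatch Logs Insights); it only illustrates the logic. The record fields `visitor`, `path`, and `region`, and the paths `/menu` and `/order`, are assumptions for illustration and may not match the real log format.

```python
from collections import Counter

# Hypothetical request paths; the real site's URLs may differ.
MENU_PATH = "/menu"
ORDER_PATH = "/order"


def summarize(records):
    """Compute the five owner metrics from parsed log records.

    Each record is assumed to be a dict with "visitor", "path", and "region".
    """
    # Distinct visitors who requested the menu page or the order page.
    menu_visitors = {r["visitor"] for r in records if r["path"] == MENU_PATH}
    buyers = {r["visitor"] for r in records if r["path"] == ORDER_PATH}

    # Keep one region per visitor so each visitor is counted once per region.
    region_of = {r["visitor"]: r["region"] for r in records}

    visits_by_region = Counter(region_of.values())
    orders_by_region = Counter(region_of[v] for v in buyers)

    return {
        "menu_visitors": len(menu_visitors),
        "buyers": len(buyers),
        "menu_without_purchase": len(menu_visitors - buyers),
        "visits_by_region": visits_by_region,
        "orders_by_region": orders_by_region,
    }
```

Calling `summarize(records)["visits_by_region"].most_common(1)` would then give the region with the most visitors, and the same call on `orders_by_region` gives the region with the most orders.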

To get these insights, the following AWS services are used:

  • CloudWatch log group
  • CloudWatch Logs Insights
  • CloudWatch dashboard
  • AWS Cloud9 environment (Amazon EC2 instance) for the web server
  • AWS Identity and Access Management (IAM) role
  • Amazon S3

In the following links, I describe the step-by-step process to complete the full project:

  1. Analyzing the website and confirming weblog data
  2. Installing the CloudWatch agent and creating the configuration file
  3. Testing the CloudWatch agent
  4. Using the simulated log and ensuring that CloudWatch receives the entries
  5. Using CloudWatch Logs Insights for analysis
  6. Adjusting the pipeline to deliver new insights
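
As a rough preview of step 2, the sketch below builds a minimal CloudWatch agent configuration in Python and prints it as JSON. The `logs` / `logs_collected` / `files` / `collect_list` layout follows the agent's configuration file format, but the log file path and log group name used here are placeholders I chose for illustration, not the values from the lab.

```python
import json


def build_agent_config(file_path, log_group_name):
    """Return a minimal CloudWatch agent config that ships one log file."""
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [
                        {
                            "file_path": file_path,
                            "log_group_name": log_group_name,
                            # Placeholder the agent resolves to the EC2 instance ID.
                            "log_stream_name": "{instance_id}",
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # Both arguments are hypothetical; use the path and group from your own setup.
    config = build_agent_config("/var/log/httpd/access_log", "apache/access")
    print(json.dumps(config, indent=2))
```

The printed JSON could be saved as the agent's configuration file; the detailed post for step 2 covers the actual values used in the lab.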

If you find this post helpful, please give it a clap, follow me on Medium, and let’s connect on LinkedIn.

Farzanajuthi

I am an AWS Community Builder. I have passed the AWS Certified Solutions Architect (C03) exam. I love serverless technology and sharing knowledge with others.