Building a Data Pipeline to Analyze Clickstream Data with AWS
While working on this project (Cloud Data Pipeline Builder [46997]), I found it very interesting, and I also ran into many issues along the way. So I decided to document it, in the hope that it will be useful to anyone who does this lab in the future.
I found this lab in AWS cloud academy. Since it is a big project, I have divided it into multiple steps so that anyone can focus on a specific area.
In this project, there is a cafe website where visitors can view the menu of products and purchase from it.
The owner of this website wants to see -
- Number of visitors who access the menu
- Number of visitors who purchase
- Number of visitors who access the menu but do not purchase
- Which regions the most visitors come from
- Which regions the most orders come from
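To make these questions concrete, here is a minimal Python sketch of how the five numbers could be computed from a handful of made-up records. It is only an illustration and not part of the lab: the field names (`visitor`, `region`, `page`), the `summarize` helper, and the sample data are all assumptions, and in the project itself the analysis is done with CloudWatch Logs Insights rather than Python.

```python
from collections import Counter

# Hypothetical, hand-made records. The real lab works on web server
# access logs; these field names are assumptions for illustration only.
SAMPLE_EVENTS = [
    {"visitor": "v1", "region": "Oregon", "page": "menu"},
    {"visitor": "v1", "region": "Oregon", "page": "purchase"},
    {"visitor": "v2", "region": "Texas", "page": "menu"},
    {"visitor": "v3", "region": "Oregon", "page": "menu"},
    {"visitor": "v3", "region": "Oregon", "page": "purchase"},
    {"visitor": "v4", "region": "Ohio", "page": "menu"},
]


def summarize(events):
    """Answer the owner's five questions for a list of event dicts."""
    # Unique visitors who opened the menu, and unique visitors who bought.
    menu_visitors = {e["visitor"] for e in events if e["page"] == "menu"}
    buyers = {e["visitor"] for e in events if e["page"] == "purchase"}

    # Map each visitor to a region (assumes one region per visitor).
    region_of = {e["visitor"]: e["region"] for e in events}

    # Count unique visitors and unique buyers per region.
    visits_by_region = Counter(region_of[v] for v in menu_visitors | buyers)
    orders_by_region = Counter(region_of[v] for v in buyers)

    return {
        "menu_visitors": len(menu_visitors),
        "buyers": len(buyers),
        # Set difference: saw the menu but never purchased.
        "menu_only": len(menu_visitors - buyers),
        "visits_by_region": visits_by_region,
        "orders_by_region": orders_by_region,
    }


if __name__ == "__main__":
    result = summarize(SAMPLE_EVENTS)
    print(result["menu_visitors"], result["buyers"], result["menu_only"])
    print(result["visits_by_region"].most_common(1))
    print(result["orders_by_region"].most_common(1))
```

The third number is just a set difference between the first two groups, which is why it helps to track unique visitors rather than raw page hits.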
To get these insights, the following AWS services are used -
- CloudWatch log group
- CloudWatch Logs Insights
- CloudWatch dashboard
- AWS Cloud9 environment (Amazon EC2 instance) for the web server
- AWS Identity and Access Management (IAM) role
- Amazon S3
In the following links, I have described the step-by-step process to complete the full project -
- Analyzing the website and confirming weblog data
- Installing the CloudWatch agent and creating the configuration file
- Testing the CloudWatch agent
- Using the simulated log and ensuring that CloudWatch receives the entries
- Using CloudWatch Logs Insights for analysis
- Adjusting the pipeline to deliver new insights
If you find this post helpful, please give it a clap, follow me on Medium, and let’s connect on LinkedIn.