How to use CloudWatch Logs Insights for analysis

Farzanajuthi
5 min read · Nov 21, 2023


This is the sixth phase of this project. This phase combines three tasks, and I will walk through each task step by step.

You have already seen in the fifth phase that your simulated data is now in CloudWatch. So, you can analyze this data using CloudWatch Logs Insights.

First, go to the CloudWatch console and choose Logs Insights (red mark 1) from the left menu bar.

Then choose a log group from the drop-down (red mark 2). In your case it will be "apache/access".

After that, select a time range (red mark 3) that covers when you loaded your data into CloudWatch. If it was 2 hours ago, select "Last 2 hours". So, choose the range depending on your data upload time.
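The console time picker maps directly to the Unix epoch timestamps that the CLI expects later for --start-time and --end-time. As a minimal sketch (assuming GNU date, which Cloud9's Amazon Linux provides), a "Last 2 hours" window looks like this:

```shell
# Epoch timestamps for a "Last 2 hours" window (GNU date).
start_time=$(date -u -d '2 hours ago' '+%s')
end_time=$(date -u '+%s')

# The window should span 7200 seconds.
echo "window: $start_time -> $end_time ($(( end_time - start_time ))s)"
```

The same pattern with '4 days ago' is used in the scripts later in this phase.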

Then enter a query in the query box (red mark 4) and click the "Run query" button. You will see results based on your query.

After that, you can save your query from the "Actions" drop-down by selecting the "Save as" option there.

You have to follow this process for the next few queries.

Logs Insights console

Task 1: Determining the number of visitors who accessed the menu

In the query section, enter this query and follow all the previously stated steps:

fields @timestamp, remoteIP
| filter request = "/cafe/menu.php"
| stats count(remoteIP) as Visitors by @timestamp
| sort @timestamp asc

You will see a result like the following image: 1,000 records are displayed out of 49,784 matched, from a total of 187,053 records scanned.

menu-visitors query result

Then click the "Actions" drop-down and choose the "Save as" option. You will get a form like the following image. Select "Create new" to create a new folder, fill in the information as shown in the image, and click the "Save" button:

Save query result

To store this result in a file named phase6-results.txt, go to the Cloud9 console and open its terminal. Then run the following commands one after another:

# Log group to query and the Logs Insights query itself
log_group_name="apache/access"
query_string='fields @timestamp, remoteIP | filter request = "/cafe/menu.php" | stats count(remoteIP) as Visitors by @timestamp | sort @timestamp asc'

# Start the query over the last 4 days and capture its ID
query_id=$(aws logs start-query \
  --log-group-name "$log_group_name" \
  --start-time $(date -u -d '4 days ago' '+%s') \
  --end-time $(date -u '+%s') \
  --query-string "$query_string" \
  --output text --query 'queryId')

# Poll until the query finishes
while true; do
  status=$(aws logs get-query-results --query-id "$query_id" --output text --query 'status')
  if [ "$status" == "Complete" ]; then
    break
  elif [ "$status" == "Failed" ] || [ "$status" == "Cancelled" ]; then
    echo "Query failed or was cancelled. Exiting."
    exit 1
  else
    sleep 5
  fi
done

# Write the results to a file
aws logs get-query-results --query-id "$query_id" --output text --query 'results' > phase6-results.txt

After running all these commands, you can see the saved file in the file tree on the left.

phase6-results.txt

Task 2: Determining the number of visitors who made a purchase

Follow all the processes that you did in Task 1, just changing the query to this one:

fields @timestamp, remoteIP
| filter request = "/cafe/processOrder.php"
| stats count(remoteIP) as Visitors by @timestamp
| sort @timestamp asc

Then save it as a separate query, named purchasers, in the "non-geo-results" folder.

Task 3: Determining the number of visitors who accessed the menu but didn’t make a purchase

In this step, go to the Cloud9 terminal again and run the following commands one after another:

# Log group to query
log_group_name="apache/access"

# Query 1: total count of visitors who accessed the menu
query_id_1=$(aws logs start-query \
  --log-group-name "$log_group_name" \
  --start-time $(date -u -d '4 days ago' '+%s') \
  --end-time $(date -u '+%s') \
  --query-string 'filter request = "/cafe/menu.php" | stats count(remoteIP) as Visitors' \
  --output text --query 'queryId')

# Poll until query 1 finishes
while true; do
  status=$(aws logs get-query-results --query-id "$query_id_1" --output text --query 'status')
  if [ "$status" == "Complete" ]; then
    break
  elif [ "$status" == "Failed" ] || [ "$status" == "Cancelled" ]; then
    echo "Query failed or was cancelled. Exiting."
    exit 1
  else
    sleep 5
  fi
done
aws logs get-query-results --query-id "$query_id_1" --output text --query 'results' > query1_results.txt

# Query 2: total count of visitors who made a purchase
query_id_2=$(aws logs start-query \
  --log-group-name "$log_group_name" \
  --start-time $(date -u -d '4 days ago' '+%s') \
  --end-time $(date -u '+%s') \
  --query-string 'filter request = "/cafe/processOrder.php" | stats count(remoteIP) as Visitors' \
  --output text --query 'queryId')

# Poll until query 2 finishes
while true; do
  status=$(aws logs get-query-results --query-id "$query_id_2" --output text --query 'status')
  if [ "$status" == "Complete" ]; then
    break
  elif [ "$status" == "Failed" ] || [ "$status" == "Cancelled" ]; then
    echo "Query failed or was cancelled. Exiting."
    exit 1
  else
    sleep 5
  fi
done
aws logs get-query-results --query-id "$query_id_2" --output text --query 'results' > query2_results.txt

# Pair the two counts and compute purchasers as a percentage of menu visitors
paste <(awk '{print $2}' query1_results.txt) <(awk '{print $2}' query2_results.txt) \
  | awk '{if ($1 != 0) printf "%.2f%%\n", ($2 / $1) * 100; else print "N/A"}' > percentage_difference.txt
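To see what the final paste and awk pipeline does in isolation, here is a toy run on hypothetical counts (250 menu visitors and 100 purchasers, values chosen only for illustration), assuming the second whitespace-separated column of each results file holds the count:

```shell
# Hypothetical query outputs: "<label> <count>" per line.
printf 'Visitors 250\n' > query1_results.txt
printf 'Visitors 100\n' > query2_results.txt

# Put column 2 of each file side by side, then print purchasers
# as a percentage of menu visitors (guarding against division by zero).
paste <(awk '{print $2}' query1_results.txt) <(awk '{print $2}' query2_results.txt) \
  | awk '{if ($1 != 0) printf "%.2f%%\n", ($2 / $1) * 100; else print "N/A"}'
# prints 40.00%
```

The division-by-zero guard matters because a query over an empty time range would produce a visitor count of 0.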

You can see the output in the console, as in these images:

query 1
query 2
percentage_difference

Congratulations!!! You have completed one more phase. Go on to the next phase.

If you find this post helpful, please give it a clap, follow me on Medium, and let's connect on LinkedIn.


I am an AWS Community Builder and have passed the AWS Certified Solutions Architect (C03) exam. I love serverless technology and enjoy sharing knowledge with others.