How to analyze a website and store weblog data
This is the second phase of this project. It consists of three tasks, and I will walk through each one step by step.
Task 1: Analyzing and understanding the lab environment
This part gives a brief description of the services used in this project. An EC2 instance runs in a public subnet, and this subnet belongs to a VPC named Lab VPC. A security group controls access to the instance. There are also some services for monitoring the logs and getting insights from them. There is no technical task to do in this step.
Task 2: Modifying a security group and verifying that the web application loads
In the second task, you have to allow TCP traffic on port 80 in the security group. There will be an EC2 instance. Wait until its instance state is Running and its status check shows 2/2 checks passed. Then click the item marked with red circle 1. You will see the security group (red circle 3) under the Security section (red circle 2). Follow the images below:
Then click the security group and select “Edit inbound rules” from the Actions menu, or scroll down the page and click the “Edit inbound rules” button.
Then click the Add rule button (red mark 1), select “Custom TCP”, enter 80 as the port value (red mark 2), and click the Save rules button (red mark 3).
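If you prefer the command line, the same inbound rule can probably be added with the AWS CLI. This is only a sketch: the security group ID below is a made-up placeholder, the 0.0.0.0/0 source range is my assumption (the console steps above do not state a source), and your lab role may not allow this call. The function only prints the command, so you can review it before running it yourself.

```shell
# Hypothetical security group ID; replace it with the one shown in your console.
SG_ID="sg-0123456789abcdef0"

# Print the AWS CLI command that adds an inbound rule for TCP port 80.
# The 0.0.0.0/0 source (anywhere) is an assumption, not something the lab states.
open_http_cmd() {
  printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port 80 --cidr 0.0.0.0/0\n' "$1"
}

open_http_cmd "$SG_ID"
```

Copy the printed line into a terminal where the AWS CLI is configured if you want to try it.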
Then go to Instances in the left-side menu, select the check box of the instance, copy its public IP address by clicking the symbol marked in green, and paste it into a new browser tab. Add “/cafe” after the IP address, like this: 44.192.88.220/cafe/. In my case, the public IP was 44.192.88.220. Use the address shown for your own EC2 instance.
DO NOT CLICK ON “open address”, because you need HTTP, not HTTPS. Clicking “open address” opens the page over HTTPS, and you will get an error.
Then you will see the web page, as in the image:
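You can also check the page from a terminal instead of the browser. The sketch below builds the plain HTTP address from the public IP and, if you call the second function, asks curl for the status code. The function names are my own, the IP is the one from my instance, and I assume curl is installed and the instance is reachable from where you run it.

```shell
# Public IP from my instance; replace it with yours.
PUBLIC_IP="44.192.88.220"

# Build the plain-HTTP URL of the cafe page for a given IP address.
cafe_url() {
  printf 'http://%s/cafe/\n' "$1"
}

# Print only the HTTP status code returned by the page
# (needs curl and network access to the instance).
check_cafe() {
  curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$(cafe_url "$1")"
}

cafe_url "$PUBLIC_IP"
# To test the page from a terminal, run: check_cafe "$PUBLIC_IP"
```

A status code of 200 would suggest that port 80 is open and the web application is answering over HTTP.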
Task 3: Observing and backing up the httpd access_log
In the third task of this phase, you have to open the Cloud9 console, find your httpd access_log file, and save a copy of it in your Cloud9 directory.
The lab includes a tip about where to find the httpd log file. In my case it was in a different directory, so I will describe how I found the file and then saved it into Cloud9.
After you get to the Cloud9 console, click the Open button.
Then type the following command to see the folders where httpd is located. If your file is in /var/log, go to that folder. If it is under /etc/, go there instead. Which folder you enter depends on where your file is located.
whereis httpd
I found it in /etc/httpd, so I ran the following command. Use your own path; if it is in /var/log, run cd /var/log/ instead.
cd /etc/httpd/
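If you are not sure which of the two locations holds the log, a small check like the one below can tell you. It is a sketch under my own assumptions: the function name is made up, it only tries the two paths mentioned in this post, and the log folder may be readable only by root, in which case you would set SUDO=sudo first.

```shell
# Leave SUDO empty for a normal check, or set SUDO=sudo if the
# log directory can only be read by root.
SUDO="${SUDO:-}"

# Print the first path from the arguments that exists; fail if none does.
find_access_log() {
  for candidate in "$@"; do
    if $SUDO test -e "$candidate"; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Try the two locations mentioned in this post.
find_access_log /etc/httpd/logs/access_log /var/log/httpd/access_log \
  || echo "access_log not found in the listed paths"
```

Use the path it prints in the commands that follow.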
Then run the following command, changing the path to match the location of your access_log file. New log entries appear in this file as you browse your website. You will see this data again in the next phase of this project.
sudo tail -f /etc/httpd/logs/access_log
Then explore your website. Visit every page (home, menu, order history) and make a purchase. All of these requests are stored in your access_log, and you can see each entry there.
To stop the tail command, press Ctrl+C.
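To get a quick overview of what was logged, you can count the requests per page. The sketch below assumes the usual Apache log layout, where the requested path is the seventh space-separated field. The function name is my own, and the sample lines (IP address, page names, sizes) are invented for illustration, not taken from my real log.

```shell
# Count requests per requested path in an Apache-style access log.
# Assumes the path is the 7th whitespace-separated field of each line.
count_requests_by_path() {
  awk '{ count[$7]++ } END { for (p in count) print count[p], p }' "$1" | sort -rn
}

# Invented sample lines, only to show the expected input format.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET /cafe/ HTTP/1.1" 200 2048
203.0.113.5 - - [01/Jan/2024:10:00:05 +0000] "GET /cafe/menu.php HTTP/1.1" 200 4096
203.0.113.5 - - [01/Jan/2024:10:00:09 +0000] "GET /cafe/menu.php HTTP/1.1" 200 4096
EOF

count_requests_by_path "$sample_log"
rm -f "$sample_log"
```

On the real file you may need root permission to read it, for example by running the awk command itself with sudo.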
Then save your access log into the Cloud9 environment. In my case the log was at the following path, so I ran this command:
sudo cp /etc/httpd/logs/access_log /home/ec2-user/environment/initial_access_log
If your access log is in a different path, change the command like this:
sudo cp /var/log/httpd/access_log /home/ec2-user/environment/initial_access_log
Here,
sudo means run the command with root permission
cp means copy
/etc/httpd/logs/access_log is the path I copy the file from
/home/ec2-user/environment/initial_access_log is the path where I want to save the copy
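After the copy, it can be worth confirming that the backup really exists and is not empty. The check below is a minimal sketch. The function name is my own, and the path is the destination from my command above, so change it if yours differs.

```shell
# Report how many lines the backup has, or fail if it is missing or empty.
check_backup() {
  backup="$1"
  if [ -s "$backup" ]; then
    printf '%s: %s lines\n' "$backup" "$(wc -l < "$backup" | tr -d ' ')"
  else
    printf '%s is missing or empty\n' "$backup" >&2
    return 1
  fi
}

# Destination used in this post; the "|| true" keeps the script going if the check fails.
check_backup /home/ec2-user/environment/initial_access_log || true
```

The line count should roughly match the number of requests you made while exploring the website.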
Congratulations!!! You have completed the second phase of this project. Go on to the next phase.