Python script to track Amazon Prices!! | Daily Python #1

Ajinkya Sonawane · Published in Analytics Vidhya · Jan 4, 2020 · 5 min read

This article is a tutorial on how to track the prices of products on any e-commerce platform using a simple Python web scraper.

Requirements:

  1. Python 3
  2. Pip

Install the following packages:

  1. requests — HTTP library for Python
  2. bs4 — Beautiful Soup, a library used for web scraping

smtplib, the library for sending emails, ships with Python’s standard library, so it doesn’t need to be installed separately.

pip install requests bs4

Let’s start by importing the libraries in the code…

import requests
from bs4 import BeautifulSoup
import smtplib

Next, we will write a function that fetches the price of a given product. For this, we will require the URL of the product from Amazon (or any other e-commerce platform).

For example, I wish to buy these Nike Sneakers from Amazon, so I will copy the URL of this particular product.

Snip of Amazon.in
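For illustration, store the copied link in a variable. The URL below is a hypothetical placeholder, not a real product page:

# Hypothetical placeholder; replace with the product link you copied
URL = "https://www.amazon.in/dp/EXAMPLE-PRODUCT-ID"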

Let’s define the function that will check the price of this product.

def check_price(URL, threshold_amt):
    # Pretend to be a regular browser so the site serves the normal page
    headers = {'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88 Safari/537.36"}
    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')
    title = soup.find(id="productTitle").get_text().strip()
    # Drop the currency symbol and the thousands separator, e.g. '₹2,999.00' -> '2999.00'
    price = soup.find(id="priceblock_saleprice").get_text()[1:].strip().replace(',', '')
    Fprice = float(price)
    if Fprice < threshold_amt:
        alert_me(URL, title, Fprice)
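Once alert_me (defined later in this article) is in place, a single check is one call:

check_price(URL, 3000.0)  # emails an alert if the price drops below ₹3000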

The dictionary ‘headers’ contains the User-Agent information, i.e. the browser’s information. You can get this info by googling ‘my user agent’.

Google Search for my user agent

The get() method in the requests package fetches the HTTP response for the given URL. In a normal browsing session, this response is what the browser parses and renders into the page you see.
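This check isn’t in the function above, but it can be worth confirming that the request actually succeeded before parsing; requests provides raise_for_status() for exactly that:

page = requests.get(URL, headers=headers)
page.raise_for_status()  # raises requests.exceptions.HTTPError on 4xx/5xx responses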

This HTTP response, saved in the variable ‘page’, is parsed using the amazing Beautiful Soup library. Beautiful Soup makes it easy to scrape information from web pages: it sits atop an HTML or XML parser, providing Pythonic idioms for iterating, searching, and modifying the parse tree. The best thing about BeautifulSoup is that it is built on top of HTML parsing libraries like html5lib, lxml, and html.parser, so a BeautifulSoup object can be created and the parser library specified at the same time.

soup = BeautifulSoup(page.content,'html.parser')

This returns the parsed HTML of the webpage we requested, and each element of the web page can now be accessed. If you want to print this parsed content, do the following:

print(soup.prettify())

The output will be somewhat like this:

Snip of the output resulted from soup.prettify()

The best way to access an element is by using its ID. Almost all important elements on a webpage have an ID — a unique text value that identifies that element. You can find it with Inspect Element in Google Chrome.

Snip of Amazon.in and the Inspect Element tool

We can see that the ID of the element containing the title of the product is ‘productTitle’. Similarly, the ID of the element containing the price is ‘priceblock_saleprice’. These IDs can change from platform to platform.

find() is used to find the element from the parsed HTML code.

get_text() is used to fetch the text of the found element.

strip() is used to remove any extra whitespaces before or after the text.
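Putting those three calls together for the title element looks like this:

title_tag = soup.find(id="productTitle")  # locate the element by its ID
title = title_tag.get_text().strip()      # extract its text and trim whitespace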

We convert the price from a string to a ‘float’. First, we strip the currency symbol using slicing ([1:] — start from index 1 and go to the end). Then, we remove the comma (‘,’) by replacing it with an empty string (‘’). Finally, we typecast the resulting string to a ‘float’.
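As a concrete walk-through, here is what happens to a made-up price string at each step:

price_text = '₹2,999.00'                  # made-up example of what get_text() returns
price_text = price_text[1:].strip()       # '2,999.00' (currency symbol sliced off)
price_text = price_text.replace(',', '')  # '2999.00' (comma removed)
Fprice = float(price_text)                # 2999.0 (typecast to float)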

Once we have the price as a number that can be compared, we set a condition on it. If I want to receive an email when the price of these Nike sneakers goes below 3000 rupees, I set the condition accordingly.

If the price goes below the given threshold amount, the function to send me an email should be triggered.

To receive an email alert, we first need to allow less secure apps on the Google account. To do this, search ‘google allow less secure apps’.

Google search for allowing less secure apps

Go to the ‘Less secure apps - Sign in - Google Accounts’ page and enable access for less secure apps.

Snip of Less secure app access (Google)

Now we could use the Gmail password directly for authentication, but a safer way is to enable Google’s 2-step verification. Just search ‘Google 2 step verification’ and turn it on.

Snip of google search for 2 step verification

After enabling 2-step verification, we can set passwords for individual apps. Search ‘google app password’ and visit the App passwords page. Google will ask you to log in for authentication. Now, select the type of application (in this case, Mail) and the type of device (in this case, Windows Computer). After clicking Generate, a popup shows the password for the application.

Let’s write the function ‘alert_me’ to send an email when it is called.

def alert_me(URL, title, price):
    # Connect to Gmail's SMTP server on port 587
    server = smtplib.SMTP('smtp.gmail.com', 587)

    server.ehlo()
    server.starttls()
    server.ehlo()

    server.login('YOUR_EMAIL', 'GOOGLE_APP_PASSWORD')

    subject = 'Price fell down for ' + title
    body = 'Buy it now here: ' + URL
    msg = f"Subject:{subject}\n\n{body}"

    server.sendmail('YOUR_EMAIL', 'TO_EMAIL', msg)
    print('Email alert sent')
    server.quit()

We start by instantiating an SMTP object, providing the host ‘smtp.gmail.com’ and port 587.

We then identify ourselves to the ESMTP server using EHLO. You can read more on this in the Python docs. Then we put the SMTP connection in TLS (Transport Layer Security) mode, so all SMTP commands that follow will be encrypted.

We prepare the subject and body of the mail and concatenate them into one variable, ‘msg’. This message is then passed to sendmail(), which sends the email to the destination email address.
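With made-up values, the final string looks like this; the blank line produced by \n\n is what separates the Subject header from the message body:

subject = 'Price fell down for Nike Sneakers'  # made-up title for illustration
body = 'Buy it now here: https://www.amazon.in/dp/EXAMPLE-PRODUCT-ID'
msg = f"Subject:{subject}\n\n{body}"
print(msg)
# Subject:Price fell down for Nike Sneakers
#
# Buy it now here: https://www.amazon.in/dp/EXAMPLE-PRODUCT-ID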

Snip of the email received

The email was sent, since I set the threshold amount above the current price for testing purposes. To keep this script running, wrap the check in a loop that sleeps for an hour between iterations, as sketched below.
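A minimal sketch of that loop, assuming the URL variable and the functions defined above (time is part of the standard library):

import time

while True:
    check_price(URL, 3000.0)  # threshold of ₹3000, as in the example above
    time.sleep(60 * 60)       # wait an hour before checking again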

I hope this article was helpful, do leave a comment if you liked it or have any suggestions.
