How to Build an Amazon Price Tracker using Python

Datahut · 4 min read · Jul 22, 2022

Note: The code is given at the end of the blog for free download

Everybody loves to buy products on Amazon at their lowest prices. I have a bucket list full of electronic gadgets that I am waiting to buy at the right price. Price wars between e-commerce marketplaces force online retailers to change their prices frequently.

The intelligent thing would be to know when the price of an item drops and then buy that item immediately. How do I know if the price of an item on my bucket list has dropped? There is commercial Amazon price tracker software available as Chrome extensions. But why pay when you can get price drop alerts for free?

It's time to give my programming skills a workout. The goal is to track the prices of the products on my bucket list using programming. If there is a price drop, the link will be sent to me via SMS. Let's build ourselves an Amazon price tracker. We will build a basic price tracking tool to experiment with.

Also Read: Competitive Pricing Strategy: How Products Are Priced

  1. In this blog, we will build a web scraper from scratch using Python to create a master file containing the product name, price, and URL of each item.
  2. We will build another web scraper that checks the prices every hour and compares them against the master file. This scraper will also be built with Python and will check for a price drop.
  3. Sellers on Amazon automate pricing, so we expect at least one of our bucket list items to have a price drop. The script will send me a price alert SMS if there is a significant price drop (say, more than 10%).
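The "significant price drop" check in step 3 boils down to a simple comparison. A minimal sketch (the function name and default threshold are illustrative):

```python
def significant_drop(old_price, new_price, threshold=0.10):
    """Return True when new_price is more than `threshold` below old_price.

    A 10% threshold means we only alert on meaningful drops, not on the
    small fluctuations that automated repricing produces constantly.
    """
    return (old_price - new_price) / old_price > threshold

# A fall from 10000 to 8500 is a 15% drop, so this would trigger an alert.
print(significant_drop(10000, 8500))
```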

How to build an Amazon web scraper in Python

We are going to start with the attributes we need to extract. To build the master list, we will use Python's requests, BeautifulSoup, and lxml libraries. We will write the data using the csv module.

Attributes we will be scraping from Amazon.

We will scrape only two attributes from each Amazon page for the master list: the price and the product name. Note that the price is the sale price, not the listing price.

Importing the libraries

import requests
from bs4 import BeautifulSoup
from lxml import etree as et
import time
import random
import csv

Adding a header to the code

Websites, especially Amazon, hate web scrapers or bots that access their data programmatically. Amazon has a heavy anti-scraping mechanism to detect and block web scrapers. The best way around this for our case is to send browser-like headers with our requests.

Headers are a vital part of every HTTP request, as they provide essential meta information about incoming requests to the target website. We inspected the headers using Postman and defined our header as below.

header = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36",
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-GB,en-US;q=0.9,en;q=0.8"
}

Building my bucket list

The next step is to add the bucket list for processing. In my case, I have five items on my bucket list, and I have added them to the program as a list. You could also put them in a text file and read it with Python. A Python list is enough for price tracking of a small bucket list, but a file would be the better choice if you have an extensive list.

We will track only the price and product name from Amazon.

bucket_list = [
    'https://www.amazon.in/Garmin-010-02064-00-Instinct-Monitoring-Graphite/dp/B07HYX9P88/',
    'https://www.amazon.in/Rockerz-370-Headphone-Bluetooth-Lightweight/dp/B0856HRTJG/',
    'https://www.amazon.in/Logitech-MK215-Wireless-Keyboard-Mouse/dp/B012MQS060/',
    'https://www.amazon.in/Logitech-G512-Mechanical-Keyboard-Black/dp/B07BVCSRXL/',
    'https://www.amazon.in/BenQ-inch-Bezel-Monitor-Built/dp/B073NTCT4R/'
]
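As noted above, a text file is the better choice for a long bucket list. A minimal sketch of the file-based approach (the filename bucket_list.txt and the sample URLs are illustrative):

```python
# For illustration, create a sample bucket_list.txt with one URL per line.
with open("bucket_list.txt", "w") as f:
    f.write("https://www.amazon.in/dp/B07HYX9P88/\n")
    f.write("https://www.amazon.in/dp/B0856HRTJG/\n")

# Read it back into a Python list, skipping any blank lines.
with open("bucket_list.txt") as f:
    bucket_list = [line.strip() for line in f if line.strip()]

print(bucket_list)  # two URLs, ready for the scraping loop
```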

Extracting Pricing and Product name from Amazon

We will define two functions that return the price and the product name when called. We are using Python's BeautifulSoup and lxml libraries to extract the pricing information. Locating the elements on the web page is achieved using XPath expressions.

Open Chrome developer tools and inspect the pricing element. The price is available in a span with the class "a-offscreen". We write the XPath expressions to locate the data and test them using Chrome developer tools.

We need to extract the price data and compare it with the master data to see if there is a price drop. We need to apply a few string manipulation techniques to get the data in the desired form.
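The two functions can be sketched as below. The "a-offscreen" class comes from the inspection above; the "productTitle" id for the product name and the exact string cleanup are assumptions, and the selectors may need adjusting whenever Amazon changes its markup:

```python
import requests
from bs4 import BeautifulSoup
from lxml import etree as et

header = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}

def get_dom(url):
    """Fetch the page and build an lxml DOM we can query with XPath."""
    page = requests.get(url, headers=header)
    soup = BeautifulSoup(page.text, "lxml")
    return et.HTML(str(soup))

def get_price(dom):
    """The sale price sits in a span with class 'a-offscreen', e.g. '₹1,299.00'."""
    price = dom.xpath('//span[@class="a-offscreen"]/text()')[0]
    # Strip the currency symbol and thousands separators before converting.
    return float(price.replace("₹", "").replace(",", "").strip())

def get_product_name(dom):
    """Assume the product title lives in the element with id 'productTitle'."""
    name = dom.xpath('//*[@id="productTitle"]/text()')[0]
    return name.strip()
```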

Building the master file by writing the data

We use Python's csv module to write the scraped data to the master file. The logic is as follows.

  1. The master file has three columns: product name, price, and product URL.
  2. We iterate through the bucket list and parse information from each URL.
  3. We also add a random time delay, giving a helpful gap between each request.
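Those three steps can be sketched as follows. Here `scrape` is a hypothetical placeholder for whatever function returns the (product name, price) pair for a URL, and the delay range is an assumption:

```python
import csv
import random
import time

def build_master_file(bucket_list, scrape, filename="master_data.csv", delay=(2, 5)):
    """Scrape each bucket list URL once and write the master CSV."""
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        # Three columns: product name, price, and product URL.
        writer.writerow(["product_name", "price", "url"])
        for url in bucket_list:
            name, price = scrape(url)
            writer.writerow([name, price, url])
            # Random delay so requests are not fired in a rapid burst.
            time.sleep(random.uniform(*delay))
```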

When you run the code snippets above, you'll see a CSV file named master_data.csv generated. You need to run this program only once.

The rest of the blog is available at the following link, due to the code formatting limitations of Medium: How to build an amazon tracker

Originally published at https://www.blog.datahut.co on July 22, 2022.
