Tossing out Fake Amazon Reviews & reRanking the Results

Amazon is my outlet for retail therapy where I will drown myself in product research until I am fully satisfied I have selected the holy grail while it’s on sale. If you spend any time filtering and sorting as I do, you have probably found yourself frustrated with Amazon’s ranking.

My Top Amazon Shopping Frustrations:

  • The 5-star product with two reviews is next to the 5-star product with a thousand reviews.
  • The 5-star product with a thousand reviews also has 200 1-star reviews and nothing in-between.
  • You have to skim the comments to ensure the product isn’t a knock-off, refurbished or expired.
  • The product has changed over time, and those 5-star reviews were for an older version, before the manufacturer started cutting costs.

That is how the reRank project was born:
https://rerank.jazmy.com

It is a pet project that generates an unbiased Amazon product ranking by identifying untrustworthy reviews and re-sorting the list of results.

Note: Currently only available for US Amazon.com

Now, whenever I want to search for a product, I can go through my reRank site and get a list of top products with reviews I can trust. I also added direct links to the full report on ReviewMeta and to the price history on Keepa.

Technical Details

I found a site, ReviewMeta, that generates a dynamic report of suspicious reviews on Amazon products and provides an adjusted star rating. What a genius idea! What if I could take it a step further? I used the Amazon API to pull the top 30 products, combined that with the ReviewMeta API, and checked for a couple of my own pet peeves, such as the percentage of one-star reviews, to create my own ranking.
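As a rough illustration of the idea (a Python sketch, not the Laravel code behind reRank), a ranking along these lines could subtract a penalty for the share of one-star reviews from the adjusted rating. The field names, the `penalty_weight` value, and the sample products are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Product:
    title: str
    adjusted_rating: float  # star rating after suspicious reviews are discounted (0-5)
    review_count: int       # total number of reviews
    one_star_count: int     # number of one-star reviews


def one_star_share(p: Product) -> float:
    """Fraction of reviews that are one-star; 0.0 when there are no reviews."""
    return p.one_star_count / p.review_count if p.review_count else 0.0


def rerank_score(p: Product, penalty_weight: float = 2.0) -> float:
    """Adjusted rating minus a penalty proportional to the one-star share.

    penalty_weight is an arbitrary illustrative value, not taken from reRank.
    """
    return p.adjusted_rating - penalty_weight * one_star_share(p)


def rerank(products: list[Product]) -> list[Product]:
    """Return the products sorted from best to worst score."""
    return sorted(products, key=rerank_score, reverse=True)


# Hypothetical sample data: one polarizing product, one steady product.
products = [
    Product("Polarizing gadget", 4.6, 1200, 200),
    Product("Steady gadget", 4.5, 1000, 20),
]

for p in rerank(products):
    print(f"{p.title}: {rerank_score(p):.2f}")
```

With these made-up numbers, the product with the slightly lower rating but far fewer one-star reviews ends up first, which is the kind of re-sorting described above.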

Technology

Laravel 5.7 using the following packages:

Getting Started

To begin, I experimented in the Amazon Scratchpad to test different queries.

https://webservices.amazon.com/scratchpad/index.html

Roadblocks

1) API Limits

My goal was to allow users to do their own searches, but Amazon limits me to one query per second, and ReviewMeta’s free API would not appreciate thousands of simultaneous hits either. Instead, I researched the most-searched items of the past year and ran only those searches.
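Running a fixed list of searches under a one-query-per-second limit can be handled with a simple throttle. The following is a minimal Python sketch under that assumption; `Throttle`, `run_searches`, and the `search` callable are hypothetical names for illustration and do not wrap any real Amazon or ReviewMeta client.

```python
import time


class Throttle:
    """Blocks so that successive calls are at least min_interval seconds apart."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._last_call = None  # monotonic timestamp of the previous call

    def wait(self) -> None:
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


def run_searches(terms, search, throttle):
    """Call search(term) for each term, pausing between calls as needed."""
    results = {}
    for term in terms:
        throttle.wait()
        results[term] = search(term)
    return results


def fake_search(term):
    """Stand-in for a real API call; returns a placeholder string."""
    return f"results for {term}"


# A short interval keeps the demo fast; 1.0 would match a one-per-second limit.
demo = run_searches(["headphones", "blender"], fake_search, Throttle(0.05))
print(demo)
```

Because the list of search terms is fixed in advance, a batch job like this can run slowly in the background without ever exposing the API limit to visitors.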

2) Caching Restrictions

Amazon has strict policies against caching prices, so I used their “Amazon Associate” widget to display each product’s price.

3) No Reviews API

When I call the Amazon Product API, it returns a JSON result, which I parse and store in a MySQL database. Unfortunately, the Amazon Product API only provides an iFrame for the reviews, so I had to write manual pattern extractions to grab the number of reviews for each star count. This became another barrier because Amazon saw my script as a bot and threw up captchas. I have to give kudos to Hartley Brody’s blog for helping me figure out what to do next: https://blog.hartleybrody.com/scrape-amazon/
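To give a feel for what a manual pattern extraction might look like, here is a small Python sketch that pulls per-star review counts out of an HTML fragment with a regular expression. The markup in `SAMPLE_HTML` is invented for this example; Amazon's real review HTML is different and changes over time, so the pattern is an assumption, not the one reRank uses.

```python
import re

# Made-up markup for illustration only; not Amazon's actual review HTML.
SAMPLE_HTML = """
<table id="histogram">
  <tr><td>5 star</td><td class="count">1,000</td></tr>
  <tr><td>4 star</td><td class="count">0</td></tr>
  <tr><td>3 star</td><td class="count">0</td></tr>
  <tr><td>2 star</td><td class="count">0</td></tr>
  <tr><td>1 star</td><td class="count">200</td></tr>
</table>
"""

# Captures the star level (1-5) and the count, which may contain commas.
ROW_PATTERN = re.compile(
    r'<td>([1-5]) star</td>\s*<td class="count">([\d,]+)</td>'
)


def extract_star_counts(html: str) -> dict[int, int]:
    """Return {stars: review_count} for 1-5 stars; levels not found stay 0."""
    counts = {stars: 0 for stars in range(1, 6)}
    for stars, count in ROW_PATTERN.findall(html):
        counts[int(stars)] = int(count.replace(",", ""))
    return counts


print(extract_star_counts(SAMPLE_HTML))
```

Pattern matching like this is brittle: any change to the page layout silently breaks it, which is one reason to default missing star levels to zero and sanity-check the totals.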

My MVP

After a few days of energy drinks and sleep deprivation, I have a minimum viable product (MVP) that I am releasing to the world to see if it’s worth spending more time to develop.