ASOS: Using Machine Learning to Get Actionable Product Insights From App Store Reviews

Stoimen Iliev
7 min read · Jul 20, 2020


1. Introduction and Motivation

Every company wants to make data-driven decisions when building its product. That is easy with quantitative data such as click-through rates, customer lifetime value and bounce rates. But what do you do with all of the qualitative data that you have?

In this post I use machine learning tools to extract actionable insights from app reviews that can help make product decisions.

Because the process is automated, I can rerun it often enough to track progress, or apply the same steps to any number of apps.

Given my passion for platform businesses, I chose the UK’s leading fashion e-commerce platform: ASOS.

Initially, I wanted to base this on Zalando, but its primary markets are Germany, France and Italy, and most of its reviews are written in the local languages. I therefore chose a company that is a leader in an English-speaking market.

2. Tools and Process

A. Scraping the Data:

This is what the reviews look like on Apple’s App Store:

App_Store_Scraper

Initially, I tried to make my own scraper with Import.io, but I ran into problems with the “Infinite scroll” functionality of the App Store. I also knew that with thousands of reviews I would reach the limits of the free account pretty fast. I therefore decided that building my own scraper with BeautifulSoup was a better time investment, but soon after I found an already developed scraper on GitHub, “App Store Scraper” by cowboy-bebug, so there was no point in reinventing the wheel.

As shown in the following image, with just a couple of lines of Python code I was able to scrape ASOS’s top 20,000 reviews from the UK App Store. They date back to 2011.

The scraping process took about 6–7 minutes.
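For reference, here is a minimal sketch of what such a scrape might look like with the app-store-scraper package. The `AppStore` class, its `review()` method and the `reviews` attribute come from that package; the wrapper function `scrape_reviews` and the argument values (`"asos"`, `"gb"`) are my assumptions for illustration, not necessarily the exact code used for this post.

```python
def scrape_reviews(app_name="asos", country="gb", how_many=20000):
    """Fetch up to `how_many` App Store reviews as a list of dicts.

    Assumes `pip install app-store-scraper`. The import lives inside the
    function so this file still loads where the package is not installed.
    """
    from app_store_scraper import AppStore

    app = AppStore(country=country, app_name=app_name)
    app.review(how_many=how_many)  # fills app.reviews; needs network access
    return app.reviews             # one dict per review (rating, title, review, date, ...)


# Usage (not run here because it calls Apple's servers):
# reviews = scrape_reviews()
# print(len(reviews), reviews[0]["rating"])
```

Keeping the scrape behind a single function makes it easy to point the same script at another app or country later.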

B. Data Cleaning & Preprocessing:

After scraping the data, I used DictWriter to save it in a very readable .csv file. The data was very clean, nicely formatted and ready to use!

Occasionally, as I was scraping, I got blocked by Apple and the scraped data contained error messages, which caused problems later on. I handled them by passing the following argument to the DictWriter: extrasaction='ignore'. It tells the writer to skip any keys that are not in its list of column names instead of raising an error.
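To make the effect of that argument concrete, here is a small sketch using only the standard library. The column names and the sample rows (including the `unexpected` key) are made up for illustration.

```python
import csv
import io

# Assumed column names; adjust them to whatever the scraper returns.
FIELDNAMES = ["date", "rating", "title", "review", "userName"]


def write_reviews(reviews, handle):
    """Write review dicts as CSV, silently dropping keys not in FIELDNAMES."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(reviews)


sample = [
    {"date": "2020-07-01", "rating": 5, "title": "Love it",
     "review": "Easy to use and quick delivery", "userName": "a_user"},
    # With the default extrasaction="raise", this extra key would raise ValueError.
    {"date": "2020-07-02", "rating": 1, "title": "Refund",
     "review": "Still waiting for my refund", "userName": "b_user",
     "unexpected": "error message"},
]

buffer = io.StringIO()  # stands in for open("reviews.csv", "w", newline="")
write_reviews(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```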

Later on, I removed the userName column and kept only reviews longer than 30 characters, because reviews like “Amazing” or “Terrible” don’t provide actionable insights.
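A rough sketch of that filtering step, assuming each review is a dict with `review` and `userName` keys (the helper name `keep_informative` and the sample rows are hypothetical):

```python
MIN_LENGTH = 30  # a review must be longer than this many characters


def keep_informative(rows, min_length=MIN_LENGTH):
    """Drop the userName column and keep only reviews longer than min_length."""
    kept = []
    for row in rows:
        if len(row["review"]) > min_length:
            kept.append({k: v for k, v in row.items() if k != "userName"})
    return kept


rows = [
    {"userName": "u1", "rating": 5, "review": "Amazing"},
    {"userName": "u2", "rating": 1, "review": "Terrible"},
    {"userName": "u3", "rating": 2,
     "review": "The app crashes every time I open my saved items"},
]
filtered = keep_informative(rows)
print(filtered)
```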

To prepare the data for the upcoming analysis I did the following things:

- Removed unwanted characters

- Removed stop words (“the”, “a”, etc.)

- Made the entire text lower case

This was easily achieved with a couple of NLTK functions.
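The three steps above can be sketched as follows. To keep the example self-contained it uses a regular expression and a tiny hand-written stop-word set as a stand-in for NLTK's much longer English stop-word list, so treat it as an illustration of the idea rather than the exact code I ran.

```python
import re

# Tiny illustrative stop-word set; NLTK's English list is far more complete.
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "to", "i", "my", "of", "in", "for"}


def clean_text(text):
    """Lower-case the text, strip non-letter characters and drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    tokens = [token for token in text.split() if token not in STOP_WORDS]
    return " ".join(tokens)


print(clean_text("The delivery is GREAT, and the app is easy to use!!!"))
```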

I plotted the top 20 most frequently used words:

Most commonly used words

Overall, we can see a very positive list of words: easy, great, good, etc.
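Counting the words behind such a chart takes only a few lines. This sketch assumes the reviews have already been cleaned into space-separated strings; the helper name `top_words` and the sample data are made up.

```python
from collections import Counter


def top_words(cleaned_reviews, n=20):
    """Return the n most frequent tokens across already-cleaned review strings."""
    counts = Counter()
    for review in cleaned_reviews:
        counts.update(review.split())
    return counts.most_common(n)


cleaned = ["easy great app", "great delivery", "great app love"]
print(top_words(cleaned, n=3))
```

The resulting list of (word, count) pairs can be passed straight to any plotting library for a bar chart.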

C. Data Analysis

First things first, let’s visualise the data so that we can get a better understanding of what we are dealing with. Below we can see the distribution of review ratings:

As we can see, the majority of ASOS reviews are 5-star, which is to be expected given the app’s 4.9 rating on the Apple App Store.

This is how the average review rating per year and per month looks:

Average Review Rating per Month

Now, if I were working at ASOS or any of its competitors, I would wonder what happened between 2013 and 2014, and why the average review rating has fallen since 2017.

I won’t be diving into this, as it goes beyond the scope of this project.
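For completeness, here is one way the monthly averages could be computed with the standard library alone. It assumes each review is a dict with a `datetime` under `date` and an integer under `rating`; the sample rows are invented.

```python
from collections import defaultdict
from datetime import datetime


def average_rating_by_month(rows):
    """Map 'YYYY-MM' to the mean star rating of the reviews written that month."""
    totals = defaultdict(lambda: [0, 0])  # month -> [sum of ratings, review count]
    for row in rows:
        month = row["date"].strftime("%Y-%m")
        totals[month][0] += row["rating"]
        totals[month][1] += 1
    return {month: s / c for month, (s, c) in sorted(totals.items())}


rows = [
    {"date": datetime(2019, 1, 5), "rating": 5},
    {"date": datetime(2019, 1, 20), "rating": 3},
    {"date": datetime(2019, 2, 1), "rating": 1},
]
monthly = average_rating_by_month(rows)
print(monthly)
```

Swapping the format string for `"%Y"` gives the per-year averages instead.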

At this point, as a Product Manager, I want to use the data to make decisions about the product. I don’t want to use a 10-year time period; I want to base my decisions on up-to-date reviews. I am also hoping that problems the app had in 2011 were already fixed by 2020. Thus, I am going to use the data from the last year and a half, specifically from Jan 2019 to July 2020, which is 2,374 reviews.
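Restricting the data to that window is a one-line filter. A sketch, assuming each review dict carries a `datetime` under `date` (the sample rows are invented):

```python
from datetime import datetime

START = datetime(2019, 1, 1)
END = datetime(2020, 8, 1)  # exclusive upper bound, so all of July 2020 is kept


def recent_reviews(rows, start=START, end=END):
    """Keep only the reviews whose date falls inside [start, end)."""
    return [row for row in rows if start <= row["date"] < end]


rows = [
    {"date": datetime(2018, 12, 31), "rating": 4},
    {"date": datetime(2019, 1, 1), "rating": 2},
    {"date": datetime(2020, 7, 31, 23, 59), "rating": 1},
    {"date": datetime(2020, 8, 1), "rating": 5},
]
recent = recent_reviews(rows)
print(len(recent))
```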

Using Natural Language Toolkit (NLTK) and SentimentIntensityAnalyzer

It’s pretty great that you can filter by star rating in the App Store, but not all stars are created equal.

Two examples of 1-star reviews:

Absolutely disgusting service. Won’t be using them again. I have lost 200 pounds

Poor connection unable to select options

Using the SentimentIntensityAnalyzer from NLTK, we can determine how negative or how positive a review is. It assigns each review a compound score between -1 (most negative) and +1 (most positive).
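A minimal sketch of the scoring and ranking, assuming NLTK is installed and the `vader_lexicon` resource can be downloaded. `SentimentIntensityAnalyzer` and `polarity_scores` are NLTK's own names; the helper functions `score_reviews` and `most_extreme` are hypothetical names I use here for illustration.

```python
def score_reviews(texts):
    """Return one VADER compound score in [-1, 1] per review text.

    The imports are local so the ranking helper below works without NLTK.
    """
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-off lexicon download
    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(text)["compound"] for text in texts]


def most_extreme(texts, scores, n=5):
    """Return (n most positive, n most negative) reviews as (score, text) pairs."""
    ranked = sorted(zip(scores, texts), reverse=True)
    return ranked[:n], ranked[-n:][::-1]


# Usage (requires NLTK and the lexicon):
# scores = score_reviews(texts)
# positive, negative = most_extreme(texts, scores)
```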

Here is the result:

Top 5 most positive reviews:

Top 5 most negative reviews:

Negative Reviews

From here on I am going to use only the reviews that are strictly negative, because negative reviews are where we can find opportunities for improvement.

To do that, I filter the data down to reviews with a compound score below -0.5 and a star rating below 4. The star filter is needed because positive reviews like “No complaints” can sometimes be scored as negative.
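The combined filter can be sketched like this, assuming each review dict already carries its `compound` score and `rating`. The scores in the sample rows are invented for illustration and are not real VADER output.

```python
COMPOUND_THRESHOLD = -0.5  # compound score must be below this
MAX_STARS = 4              # star rating must be below this


def strictly_negative(rows):
    """Keep reviews that are negative by both sentiment score and star rating."""
    return [
        row for row in rows
        if row["compound"] < COMPOUND_THRESHOLD and row["rating"] < MAX_STARS
    ]


rows = [
    {"review": "No complaints", "compound": -0.6, "rating": 5},         # dropped by star filter
    {"review": "Refund never arrived", "compound": -0.7, "rating": 1},  # kept
    {"review": "Bit slow to load", "compound": -0.2, "rating": 2},      # dropped by score filter
]
negative = strictly_negative(rows)
print(negative)
```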

Let’s see the most commonly used words in negative reviews.

To visualize the data further, I created a word cloud:

Just by checking the word cloud and the most commonly used words I can already make an assumption about what problems users are having with the app, but let’s try to automate the process and use ML to determine that!

Latent Dirichlet Allocation (LDA)

To get actionable results I am going to use a Latent Dirichlet Allocation (LDA) model.

LDA is an unsupervised machine learning model that helps get insights from text. It’s used for topic modelling; the best thing to compare it to is “clustering for text”. LDA tries to find what is common between different texts/documents/reviews and groups them into as many clusters (topics) as you would like.

The benefit of LDA compared to other topic modelling techniques is that it is easy to apply and, in my experience, gives accurate results!

I ran multiple versions of the LDA and found that 5 clusters with 3 words each delivered the most diverse and insightful suggestions.
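A sketch of such a run, assuming scikit-learn (version 1.0 or later) as the LDA implementation. `CountVectorizer` and `LatentDirichletAllocation` are scikit-learn classes; the helper names `fit_lda` and `top_terms_per_topic`, and the parameter choices other than 5 topics and 3 words, are assumptions for illustration.

```python
def fit_lda(cleaned_reviews, n_topics=5):
    """Fit LDA on bag-of-words counts; return (topic-word weights, vocabulary)."""
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(cleaned_reviews)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(counts)
    return lda.components_, vectorizer.get_feature_names_out()


def top_terms_per_topic(topic_word_weights, vocabulary, n_words=3):
    """For each topic, list the n_words terms with the largest weights."""
    topics = []
    for weights in topic_word_weights:
        order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
        topics.append([vocabulary[i] for i in order[:n_words]])
    return topics


# Usage (requires scikit-learn):
# weights, vocab = fit_lda(negative_review_texts)
# for i, terms in enumerate(top_terms_per_topic(weights, vocab)):
#     print(f"Topic #{i}:", " ".join(terms))
```

Changing `n_topics` and `n_words` is how the different versions can be compared against each other.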

Findings

Topics found via LDA:

Topic #0:
filter company delivery

Topic #1:
item delivery day

Topic #2:
app discount annoying

Topic #3:
item refund order

Topic #4:
order size item

Based on the LDA findings, the above topics lead me to think that ASOS should focus on the following:

1. Adding a filter that allows you to filter items/products by delivery.

2. Delivery times and dates. Users are very frustrated with them; perhaps ASOS needs to be clearer about its delivery dates and times, or it had huge order delays.

3. Discounts. For some reason, a big part of the customers are not happy with the way discounts work.

4. The refund process. People are not happy with it.

5. Sizing. People have problems picking the right size for their item.

Of course, if I were working for ASOS I would most probably (hopefully) be familiar with all those user problems, so the generated topics would have even more meaning and I could potentially find them even more useful!

Going further

Areas for improvement

I haven’t used the Title field at all. It provides very rich data, typically a summary of the problem, so it could potentially provide even better insights than the Review. Another option is to merge the Title and Review fields, as both are supposed to describe the same problem, so we could end up with more data on the same issue. Of course, extra words in the Title could also throw some things off.

I could also try other topic modelling techniques such as LSA, PLSA and lda2vec, compare the results and potentially get even more insights.

Other use cases

1. Competitor Analysis

Now that I have all of this nicely put in a Python file, I can very easily execute the same code to analyse the App Store reviews of some of ASOS’s competitors, such as Zalando, AllSaints, House of Fraser, Farfetch, Missguided, New Look, Pretty Little Thing, Boohoo, Matches Fashion, Bestseller and Alibaba, and create a SWOT competitor analysis based on positive and negative reviews.

2. Product Management

If I were working at ASOS as a Product Manager, I would narrow the time frame to 30 days and execute the script every month to get a better understanding of my users’ feelings towards the app/service/product, get feature ideas, find bugs, and improve the user experience and customer satisfaction.

3. Get insights for other apps

I can also use this to make informed decisions about any other app on Apple’s App Store. If I have a startup idea, I can run the same analysis on my potential competitors and use the insights for positioning in the market.

_____


Stoimen Iliev

Pro Bono Machine Learning Consultant | Senior Product Manager & Certified Scrum Product Owner | MBA at Cornell Johnson | Fulbright Scholar | Software Engineer