Managing the Reviews
Recently, I wrote about my difficulty getting interviews conducted for my current project. And while I am still working that out, I decided to move on and start examining the written reviews the Delight Games app has.
Have you ever attempted to identify and then analyze a set of written reviews on a product? I had not, and I initially found it overwhelming. I knew it would be. I recalled that in my early discussions with the app’s creator, he had stated they had 20,000+ reviews.
More like 30,000+.
The reviews are spread out across multiple platforms. They have them on Google Play, Amazon, Apple, and Windows Phone. So I’m looking at tons of reviews that are everywhere. How to manage?
I honestly didn’t know.
See, the app has been around at this point for about a year and a half. To me, that seemed pretty young. It seemed like, technically, any and all of those reviews could be analyzed. Obviously I could not analyze 30,000+ reviews — nor did I need to — but it seemed to me like all of those reviews could be candidates for analysis.
Let’s Start With the Stars
The easiest place to begin is by breaking the reviews up by rating. While I suspect that reviewers who give four and five stars might not differ much from each other (and the same goes for those who give one and two stars), looking at the reviews that way gave me an initial handle on it.
However, when I took a look at Google Play, I could see that there were 27,609 reviewers who left five stars.
Oh boy.
Well, obviously I can do some sort of random sampling, right? I reached out to a couple of colleagues for advice. What would be a good amount to sample in a situation like this? I’m really not used to working with these kinds of numbers.
The initial response I got helped me settle on stratified random sampling. Since the number of reviews drops significantly from four stars on down (four stars has 4,453 reviews and two stars has 247), the sampling would need to be disproportionate. That is, it wouldn’t make sense to take a 25% random sample at each star rating. First, that’s just not manageable at the five-star level. Second, I want a proportionally larger sample where there are fewer reviews, in order to make sure I understand that rating correctly.
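As a rough sketch of what disproportionate stratified sampling looks like in practice: the five-, four-, and two-star counts below come from this post, but the three- and one-star counts, the sample-size cap, and the data structures are my own assumptions for illustration.

```python
import random

random.seed(42)  # make the draw reproducible

# Reviews grouped by star rating. The 5-, 4-, and 2-star counts match the
# post; the 3- and 1-star counts are invented placeholders.
reviews_by_stars = {
    5: [f"5-star review #{i}" for i in range(27609)],
    4: [f"4-star review #{i}" for i in range(4453)],
    3: [f"3-star review #{i}" for i in range(900)],
    2: [f"2-star review #{i}" for i in range(247)],
    1: [f"1-star review #{i}" for i in range(400)],
}

def sample_size(n_reviews, cap=300):
    """Take every review at sparse ratings, but cap the huge strata."""
    return min(n_reviews, cap)

# Disproportionate: ~1% of the 5-star pile, but 100% of the 2-star pile.
sample = {
    stars: random.sample(pool, sample_size(len(pool)))
    for stars, pool in reviews_by_stars.items()
}

for stars, picked in sorted(sample.items(), reverse=True):
    print(f"{stars} stars: sampled {len(picked)} of {len(reviews_by_stars[stars])}")
```

With a flat cap like this, the sampling fraction rises as a stratum shrinks, which is exactly the "more examples where there are fewer reviews" idea; the cap of 300 is an arbitrary choice, not something from the original analysis.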
But Wait… I Can’t Use All the Data
A second person I spoke with really helped me see that my data set wasn’t as viable as I first thought.
“When was the app last updated?” she asked.
Crap. You know, that is a terrific question. And suddenly, things became more manageable.
The app has been updated numerous times. I have no idea how many, but plenty. If I included a review from the second month after the app launched (and at this point it’s about a year and a half in), then I could be including irrelevant data. If someone loved or disliked the app for a feature that no longer exists, then that review really isn’t all that helpful in the current context, is it?
So now the plan is to pull together the reviews from the latest version. Then I will most likely have to use the stratified random sampling method, at least at the five-star level. The other levels may be manageable enough to analyze in full. This app is really popular, and lots of people leave it five stars. My biggest concern has been what to do with this pile of five-star reviews. Not a bad problem for the app developer to have, I’m sure.
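The "latest version only" filter is simple to express in code. Everything here is hypothetical: the record fields, the version string, and the sample reviews are my assumptions, since the post doesn't describe how the review data is stored.

```python
# Hypothetical review records; the field names and values are invented.
reviews = [
    {"stars": 5, "text": "Love the new chapters!", "app_version": "2.4.1"},
    {"stars": 2, "text": "Crashes on my phone.", "app_version": "1.3.0"},
    {"stars": 4, "text": "Great stories.", "app_version": "2.4.1"},
]

LATEST_VERSION = "2.4.1"  # assumed current release

# Keep only reviews written against the current version, so comments
# about removed or changed features don't pollute the analysis.
current = [r for r in reviews if r["app_version"] == LATEST_VERSION]
print(len(current))  # prints 2
```

Sampling would then run on `current` rather than the full pile, which is what shrinks the five-star problem down to a workable size.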
And that’s it. Those are the procedures I put in place. But I owe thanks to two great people who helped me think this through and see that it could be a manageable process that could yield some great results.