Yelp’s Best Burritos

Nick Campanelli
7 min read · Jul 29, 2019

Classification and Sentiment Analysis — Part 3

This is a three-part project working with Yelp’s Open Dataset, an “all purpose dataset for learning.” In this, part 3, we will work with sentence-level sentiment analysis using VADER and a technique for identifying undervalued establishments. Part 1 here, part 2 here. Code here.

In parts 1 and 2 of this three-part project we built a simple text classifier. Even with simple methods of pre-processing, feature selection and model implementation, we still observed high classification performance on a binary target variable. The third part of this project moves away from that classifier and works instead with sentiment analysis on the set of reviews that contain the word ‘burrito’ (or ‘burritos’). This set contains 51,764 reviews for 3,279 unique restaurants.

The goal here is to separate sentiment concerning a specific part of the review, specifically burritos, from the overall star rating of the restaurant. The idea is that reviews might not always contain information that is particularly useful to what a reader wants to know. If we are interested in finding the best burritos in San Francisco, for example, just looking at the establishment with the highest star rating won’t necessarily get us to our goal. Perhaps the place that serves the best burritos has a poor star rating because they don’t have vegetarian options or they have really slow counter service. By separating out sentences from each review that contain our word of interest and running them through sentiment analysis we can assign a “burrito sentiment score” to each review and then use those scores to find the establishments that inspire the most positive burrito sentiment. Let’s get down to it!

Getting Burrito Sentences

The first thing we’ll have to do is extract the sentence or sentences that contain some form of the word “burrito” from the rest of the review. To accomplish this we’ll first take our plain text string and split it on sentence-ending punctuation using Python’s regular expressions. (Aside: some reviews have no punctuation at all, for whatever reason, so we’ll need to check for punctuation before splitting on it.) Once the review has been split into a list of sentence strings, we can use a simple list comprehension to pull out the ones that contain “burrito”. The function then returns those sentences.
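The splitting and filtering steps described above could be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the function name `get_keyword_sentences`, the exact regular expression, and the choice to return a list are all assumptions.

```python
import re

# Split after '.', '!' or '?' whenever whitespace follows.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def get_keyword_sentences(review, keyword="burrito"):
    """Return the sentences in `review` that mention `keyword` (case-insensitive)."""
    text = review.strip()
    if re.search(r"[.!?]", text):
        sentences = SENTENCE_END.split(text)
    else:
        # Some reviews have no punctuation; treat the whole text as one sentence.
        sentences = [text]
    # Substring match, so "burrito" also catches "burritos".
    return [s for s in sentences if keyword in s.lower()]


example = "The tacos were fine. I loved the carne asada burrito! Service was slow."
print(get_keyword_sentences(example))  # ['I loved the carne asada burrito!']
```

With a pandas DataFrame, a function like this could be passed to `apply` on the review text column to produce a new column of burrito sentences.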

As we’ve done previously, using this function with pandas’ apply method lets us run this transformation on every row at once. For a clearer idea of what we’re doing in this step, here’s the output of this function on a full text review. The only thing returned by our function is the sentence that mentions burritos.

Full review to sentence of interest. Only the sentence mentioning burritos is returned

Sentiment Analysis with VADER

VADER, or the Valence Aware Dictionary and sEntiment Reasoner, is a “lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media, and works well on texts from other domains.” (Github) VADER offers some advantages over other sentiment analyzers that make it perfect for our use case.

  1. It is easy to install and use. A simple “pip install” command will get it up and running on Python 3.
  2. It works quickly! I also tried Google’s Natural Language API, perhaps the gold standard of sentiment analysis, but the run-time for performing sentiment analysis on all our sentences was well over a day.
  3. It performs quite well on social media text and also generalizes well to other domains.
  4. No training data! Because it is a rule based system we can just plug and play.

If you’re curious about how it works and performs on different use cases, definitely check out the paper published by the authors. It is well written and easy to understand. Let’s see how it works in practice.
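As a rough sketch of what calling the analyzer looks like, the snippet below scores a single sentence. It assumes the `vaderSentiment` package is installed (`pip install vaderSentiment`); the helper name `score_sentence` and the example sentence are made up for illustration, and the import is guarded only so the snippet still loads without the package.

```python
try:
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
except ImportError:
    # Package not installed; run `pip install vaderSentiment` to try this out.
    SentimentIntensityAnalyzer = None


def score_sentence(sentence, analyzer):
    """Return the analyzer's polarity dictionary for one sentence."""
    return analyzer.polarity_scores(sentence)


if SentimentIntensityAnalyzer is not None:
    analyzer = SentimentIntensityAnalyzer()
    scores = score_sentence("The breakfast burrito was delicious!", analyzer)
    # A dictionary with the keys 'neg', 'neu', 'pos' and 'compound'.
    print(scores)
```

Creating the analyzer once and reusing it for every sentence avoids reloading the lexicon on each call.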

The output of the sentiment intensity analysis, given a string, is a polarity dictionary with four measures: positive, negative, neutral, and compound. Positive, neutral and negative are the proportions of the text that fall into each class, and they add up to 1. The compound score is the more useful measure as a true sentiment score, offering a value from -1 (most negative) to 1 (most positive). The package calls this compound measure a “normalized weighted composite score.”

To get a sense for how this works, an example is reproduced below. The first block of text is the original review. Then, as detailed above, we’ll pull out the sentence that contains the word “burrito”. Finally, this sentence is run through the intensity analyzer to get our polarity dictionary back. As expected, the sentiment concerning this specific burrito sentence is a mix of positive and neutral, with a compound score of 0.64, so the analyzer classifies it correctly.
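To make “normalized” a little more concrete: as far as we can tell from the vaderSentiment source, the summed word valences x are squashed into the range -1 to 1 with x / sqrt(x² + alpha), where alpha defaults to 15. The sketch below reproduces only that final step; the function name and the sample inputs are ours, and the rule-based valence adjustments that come before it are not shown.

```python
import math


def normalize(score, alpha=15):
    """Map an unbounded summed valence onto [-1, 1], approaching the ends smoothly."""
    norm = score / math.sqrt(score * score + alpha)
    # Clamp defensively, as the library also does.
    return max(-1.0, min(1.0, norm))


for raw in (-8.0, -2.0, 0.0, 2.0, 8.0):
    print(raw, round(normalize(raw), 3))
```

The practical consequence is that a sentence with several strongly positive words moves toward 1 but never quite reaches it, while a sentence with no lexicon words at all scores exactly 0.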

Positive sentiment test for the analyzer

Let’s also look at a review with negative sentiment to see if the analyzer can correctly polarize the sentence. This review is obviously quite negative, and the analyzer picks that up, flagging the negative sentiment associated with “fake egg” and “not very appetizing”. The compound score is -0.63, which is quite negative.

Negative sentiment test for the analyzer

Now that we have a tool for assigning a sentiment intensity to each burrito sentence, we can use the scores to build a composite rating for every restaurant in our sample.

Undervalued Burritos

As mentioned previously, one might expect star ratings to be a good starting spot for finding the best burritos. So let’s begin there. Looking at all the reviews that mention burritos, grouping by restaurant and keeping only the restaurants that have more than 10 reviews (793 restaurants) gives us the list reproduced below. The best burrito restaurant according to this ranking system is Pollos LaChuya in Tempe, Arizona, with a mean star rating of 4.92 based on 12 reviews. However, their mean burrito sentiment score is fairly low, 0.16. Their burritos don’t make anyone feel too strongly, apparently.

Top restaurants (>10 reviews), ranked by stars, in our burrito reviews dataset

By changing the argument given to sort_values, we can instead look at the top 10 restaurants based on mean sentiment score. The new list is completely different. The new number 1 is Rosarita’s Beach in Las Vegas, Nevada, which garnered a mean sentiment score of 0.50 (although it is now closed, which is not a great review of this method). Their mean stars would tell us this is a poor restaurant, but apparently their burritos make reviewers feel quite positive.

Top restaurants (>10 reviews), ranked by burrito sentiment, in our burrito reviews dataset
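The two rankings above come down to grouping reviews by restaurant, keeping restaurants with more than 10 reviews, averaging, and sorting by one column or the other. The sketch below shows that logic in plain Python so it runs without dependencies; the project itself uses pandas, and the field names `business`, `stars` and `sentiment` as well as the toy data are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean


def rank_restaurants(reviews, key="sentiment", min_reviews=10):
    """Average stars and sentiment per restaurant, then sort descending by `key`.

    `reviews` is an iterable of dicts with 'business', 'stars' and 'sentiment'.
    Only restaurants with MORE than `min_reviews` reviews are kept.
    """
    grouped = defaultdict(list)
    for review in reviews:
        grouped[review["business"]].append(review)

    rows = []
    for name, group in grouped.items():
        if len(group) > min_reviews:
            rows.append({
                "business": name,
                "n_reviews": len(group),
                "stars": mean(r["stars"] for r in group),
                "sentiment": mean(r["sentiment"] for r in group),
            })
    return sorted(rows, key=lambda row: row[key], reverse=True)


# Toy data: A has great stars but lukewarm burrito sentences, B is the reverse.
toy = (
    [{"business": "A", "stars": 5, "sentiment": 0.1} for _ in range(3)]
    + [{"business": "B", "stars": 2, "sentiment": 0.5} for _ in range(3)]
    + [{"business": "C", "stars": 4, "sentiment": 0.9}]
)
print([row["business"] for row in rank_restaurants(toy, key="stars", min_reviews=1)])
print([row["business"] for row in rank_restaurants(toy, key="sentiment", min_reviews=1)])
```

Switching `key` plays the same role as changing the argument to sort_values: the data and the filter stay the same, only the ordering changes.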

To dive further into this method, let’s look at a few samples of burrito reviews from each of the top-ranking restaurants. The first three reviews are for Rosarita’s Beach (top based on sentiment), and the next three are for Pollos LaChuya (top based on stars).

Reviews for our top restaurant when ranked by sentiment
Reviews for our top restaurant when ranked by stars

What is immediately apparent is that the reviews for Rosarita’s Beach have more features (words) to work with. This results in a more confident sentiment analysis. For example, the first review shown for Pollos LaChuya was labeled as totally neutral by the sentiment analyzer. While this is correct given the narrow question being asked, there is a high chance that a restaurant with ‘phenomenal meats’ also has phenomenal burritos. Alas, therein lies a limitation to this method.

However, this does not make the results of this method invalid. As an attempt to use sentiment analysis to surface potentially undervalued restaurants, working with sentences that contain a keyword of interest seems to hold promise. To improve performance and reliability there are some tweaks we could implement. First, getting more reviews for each restaurant would definitely help. As illustrated above, sentences classified as totally neutral really hurt a restaurant’s ranking. Getting more reviews would work towards solving this problem because we could drop the reviews that don’t really tell us anything about the quality of the burrito. Sentences such as “I think I want to try the burrito next time.” are actually surprisingly common and weigh down reviews that actually praise the quality of the burritos.
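One way to picture the first tweak is to drop the neutral sentences before averaging. This is a hypothetical sketch, not something the project implements: the function name and the `neutral_band` threshold are ours, and the scores in the example are made-up compound values.

```python
from statistics import mean


def mean_informative(compounds, neutral_band=0.0):
    """Average compound scores, ignoring those within `neutral_band` of zero.

    Returns None when every score is neutral, so the caller can decide
    how to rank a restaurant we know nothing about.
    """
    informative = [c for c in compounds if abs(c) > neutral_band]
    return mean(informative) if informative else None


scores = [0.0, 0.6, 0.0, 0.4]  # two uninformative sentences, two positive ones
print(mean(scores))              # plain mean, dragged toward zero
print(mean_informative(scores))  # mean over the informative sentences only
```

The trade-off is that each restaurant ends up with fewer usable sentences, which is why this only becomes practical with more reviews per restaurant.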

The second possible tweak is including more information for the sentiment analyzer to work with. Oftentimes, a review will include potentially useful information AFTER the sentence about the burrito, as shown in the following example: “I tried the burrito again. Last time it was terrible but with the new cook it’s delicious.” This review would get classified as totally neutral given the current method because all the useful information is contained in a second independent sentence that doesn’t include the keyword. This could prove tough to implement, but perhaps searching for a keyword such as “it’s” in the next sentence — hopefully referring to the burrito — could allow this extra information to be kept around when parsing sentences.
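A rough sketch of that second tweak might look like the following. It is an untested idea rather than part of the project: it keeps a sentence that directly follows a keyword sentence if it contains the word “it” (which also matches “it's” and “it’s”), and it only looks one sentence ahead. The function and pattern names are hypothetical.

```python
import re

SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
# Matches "it" as a whole word, which also covers "it's" and "it’s".
FOLLOW_UP = re.compile(r"\bit\b", re.IGNORECASE)


def get_sentences_with_context(review, keyword="burrito"):
    """Keep keyword sentences plus an immediately following sentence that says 'it'."""
    sentences = SENTENCE_END.split(review.strip())
    kept = []
    for i, sentence in enumerate(sentences):
        if keyword in sentence.lower():
            kept.append(sentence)
        elif i > 0 and keyword in sentences[i - 1].lower() and FOLLOW_UP.search(sentence):
            # Hopefully "it" refers to the burrito mentioned just before.
            kept.append(sentence)
    return kept


review = (
    "I tried the burrito again. Last time it was terrible but with "
    "the new cook it's delicious. Parking is easy."
)
print(get_sentences_with_context(review))
```

The obvious weakness is that “it” might refer to something else entirely, so sentences pulled in this way would deserve a spot check before being trusted.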

This method also has potential for generalization. Want to find the yoga studio with the best vinyasa class or the movie theatre with the comfiest seats? By just altering the keyword we’re searching for and pulling sentences of interest out of reviews we should be able to find undervalued establishments in all realms of business. This could be particularly useful in a world where Yelp may or may not filter certain reviews to improve overall star ratings (links to Forbes).

Regardless, thanks for sticking around as we played around with Yelp’s Open Dataset for learning by building a classifier and looking at a method for finding undervalued establishments. Let me know below if you have any questions. Thanks for reading!
