Review of Yelp DataSet Using Tableau

Madeleine Moghadasi
Published in The Startup
4 min read · Feb 2, 2021
Designed By Freepik

Overview

Have you ever paid a lot of money, only to get a service or product that lacks quality? Even though many businesses offer exchanges or refunds, you don’t want to waste your time arguing with customer service. It therefore makes sense to choose the best business in your area in the first place. You might wonder how you could know that beforehand. Nowadays, countless websites gather customer reviews and opinions of the services people received. Overall, this drives businesses to be more accountable to customers and creates a competitive market, which ultimately benefits consumers.

If you are looking for a directory of information about a particular business, Yelp is the place for you. Yelp has a listing for nearly every type of local business, and over 250 million reviews have been written on the site. That makes it a pretty good indicator for identifying the best option for your needs.

In this report, I investigate the Yelp data with the aim of providing some exploratory data analysis on it. I used Tableau to create the charts and maps, working from the data that Yelp provides.

Milestones

Data wrangling

Before the visualization work can begin, a number of preparation steps must be taken in Python. The JSON data is imported, converted to a Pandas data frame, and then saved as a .csv file.
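The import step can be sketched roughly as follows. This is a minimal illustration, not the exact code used for this report: it assumes the file is stored as JSON Lines (one JSON object per line), and the file names, the helper `json_lines_to_csv`, and the sample records are all made up so that the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def json_lines_to_csv(json_path, csv_path):
    """Load a JSON Lines file into a DataFrame and save it as CSV."""
    # lines=True tells pandas that every line is a separate JSON object
    # (an assumption about how the file is laid out).
    df = pd.read_json(json_path, lines=True)
    # index=False keeps the row index out of the CSV file.
    df.to_csv(csv_path, index=False)
    return df


# Tiny made-up sample standing in for the real business file.
sample_records = [
    {"business_id": "b1", "name": "Cafe One", "state": "NV", "stars": 4.5},
    {"business_id": "b2", "name": "Shop Two", "state": "AZ", "stars": 3.0},
]

workdir = Path(tempfile.mkdtemp())
json_path = workdir / "business.json"
csv_path = workdir / "business.csv"
json_path.write_text("\n".join(json.dumps(r) for r in sample_records))

df = json_lines_to_csv(json_path, csv_path)
reloaded = pd.read_csv(csv_path)  # this CSV is what Tableau would open
```

Saving to .csv is convenient here because Tableau can connect to a flat text file directly.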

I checked the dataset for duplicates and found none. Then I removed the unnecessary columns from the data frame. The dataset includes a column indicating whether a business is open or not. Because I am only interested in businesses that are active right now, I filtered out the ones that are closed. Another column contains the business categories, and each business has a number of subcategories. I split those categories to form distinct business divisions, and I discarded the rows with missing values in this column as well. Finally, I corrected the data types of some columns that had been incorrectly identified.
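The cleaning steps above could look something like the sketch below. The column names (`is_open`, `categories`, `stars`, `hours`), the choice of which column to drop, and the sample rows are assumptions made for illustration; the order of the steps is also slightly rearranged so that missing categories are dropped before splitting.

```python
import pandas as pd

# Made-up rows; the column names are assumptions, not taken from the report.
raw = pd.DataFrame(
    {
        "business_id": ["b1", "b2", "b3", "b4", "b4"],
        "name": ["Cafe One", "Shop Two", "Bar Three", "Spa Four", "Spa Four"],
        "is_open": [1, 0, 1, 1, 1],
        "categories": [
            "Restaurants, Food",
            "Shopping",
            None,
            "Beauty & Spas",
            "Beauty & Spas",
        ],
        "stars": ["4.5", "3.0", "4.0", "5.0", "5.0"],
        "hours": [None, None, None, None, None],
    }
)


def clean_businesses(df):
    # Remove exact duplicate rows (the sample contains one on purpose).
    df = df.drop_duplicates()
    # Drop a column that is not needed for the analysis (hypothetical choice).
    df = df.drop(columns=["hours"])
    # Keep only businesses that are currently open.
    df = df[df["is_open"] == 1]
    # Discard rows with no category information.
    df = df.dropna(subset=["categories"])
    # Fix a column that was read as text instead of numbers.
    df = df.assign(stars=pd.to_numeric(df["stars"]))
    # Split the comma-separated categories and give each one its own row.
    df = df.assign(category=df["categories"].str.split(", ")).explode("category")
    return df.drop(columns=["categories"]).reset_index(drop=True)


cleaned = clean_businesses(raw)
```

Giving each category its own row means a business is counted once per category, which is worth keeping in mind when reading category totals later.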

Exploratory data analysis

Because Tableau has become more and more popular among analysts for its great features and ease of use, I have dedicated some of my projects to it.

Let’s dive right in:

First, I visualized the distribution of businesses registered on Yelp based on geolocation data. Data are available for most of the US states and the major provinces in Canada.

I then used a treemap to visualize the popular business categories available in the dataset. Only the top 20 categories were selected from the hundreds of categories available. According to the data, restaurants, shopping, and food together make up almost one third of the total.
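For readers who want to reproduce the numbers behind such a treemap outside Tableau, here is a rough plain-Python sketch of counting categories and computing the share of the top ones. The helper `top_category_shares` and the sample data are made up, and the share is taken over all category entries, which is an assumption about how the total is defined.

```python
from collections import Counter


def top_category_shares(category_lists, top_n):
    """Return (category, count, share) for the top_n most common categories."""
    # Count every category entry across all businesses.
    counts = Counter(cat for cats in category_lists for cat in cats)
    total = sum(counts.values())
    # Share is measured against all category entries, not against businesses.
    return [(cat, n, n / total) for cat, n in counts.most_common(top_n)]


# Made-up example data: one list of categories per business.
businesses = [
    ["Restaurants", "Food"],
    ["Shopping"],
    ["Restaurants", "Nightlife"],
    ["Beauty & Spas"],
    ["Restaurants"],
    ["Food", "Shopping"],
]

top = top_category_shares(businesses, top_n=3)
combined_share = sum(share for _, _, share in top)
```

With the real data, `top_n=20` would correspond to the selection shown in the treemap.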

Where are the best businesses located? I sorted the states and provinces by the average star rating they received from customers. Vermont ranks first among the states. I then used color to show the review count for every state. As the figure shows, Nevada and Arizona stand out in terms of the number of reviews while still maintaining a good average star rating.

I therefore dug deeper into these areas and looked at Nevada, Arizona, and Ontario separately. Nevada appears to be the highest rated of the three, with only a few businesses receiving one star.

I then filtered these results by category within each state, so we can see the specifics of each group. I switched to side-by-side bar charts and used color to differentiate performance by star rating, with red for the worst and green for the best.

If we go through the cities in Nevada and search for the best ones, we find Spring Valley, South Las Vegas, and 4321 W Flamingo Rd (a street address that appears in the data as a city name). Here’s how I did it: I classified the cities into four groups determined by their average star rating and the number of reviews they received.
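The four-group idea can be sketched in a few lines of Python. The cut-offs are not stated here, so this sketch assumes the median of each measure as the dividing line; the function `classify_cities`, the group labels, and the numbers are all made up for illustration.

```python
from statistics import median


def classify_cities(city_stats):
    """Assign each city to one of four groups.

    city_stats maps a city name to (average_stars, review_count).
    The cut-offs are the medians of each measure, which is an assumption.
    """
    star_cut = median(stars for stars, _ in city_stats.values())
    review_cut = median(reviews for _, reviews in city_stats.values())

    groups = {}
    for city, (stars, reviews) in city_stats.items():
        rating = "high rating" if stars >= star_cut else "low rating"
        volume = "many reviews" if reviews >= review_cut else "few reviews"
        groups[city] = f"{rating}, {volume}"
    return groups


# Made-up numbers for illustration only.
stats = {
    "City A": (4.4, 12000),
    "City B": (4.2, 300),
    "City C": (3.1, 9000),
    "City D": (2.9, 150),
}
groups = classify_cities(stats)
```

Cities that land in the "high rating, many reviews" group are the natural candidates for "best", since a high average backed by few reviews is less reliable.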

For readers who want more information, I have listed the best businesses for every service in these cities.

I was also interested in learning more about businesses in Ontario. To display them, I used a packed bubble chart sized by the number of businesses in each category. Once again, I show only the top ten. Restaurants make up about half of the business activity shown, and nightlife also makes the top list.

A good next step would be to look behind the reviews and analyze their sentiment. After all, a high number of reviews does not necessarily indicate a positive experience.
