Heyy Zomato, where do Punekars eat??

Data Cleaning and EDA for Pune restaurants listed on Zomato

Akanksha Nagar
Analytics Vidhya
6 min read · Sep 3, 2020


Will we ever go back to the old normal? Probably not. Is this our new normal? Hell NO!!! But what we do know is that currently, we cannot go to restaurants with our friends to savor our favorite dishes.
So until we get to our new normal, let us dig into the data to maybe find out a place to flock to… Or to get away from the misconception that Joey loves Pizza.. Noo.. In Pune, Joey loves Biryani more!!!!

Below are some of the questions we will be answering:

  1. Which is the biggest chain in Pune? Is it McDonald’s? Or Cafe Coffee Day? Or Dominos?
  2. Biryani? Or Pizza? What do Punekars love?
  3. Out of money? Need to find the best budget restaurants with impressive reviews and ratings?
  4. Too good to be true?? Which restaurants have a good number of reviews, but a rating that does not reflect it?

I collected this data by scraping around 300 pages of Zomato.
Before digging into the data, below are the links for the dataset and notebooks-

Data Cleaning

  1. Cleaning column values.
  2. Extracting digits from certain columns like reviews and cost.
  3. Dropping duplicate values.

Let’s check what our data looks like:

A glimpse of our dataset

You know, as they say, Garbage IN Garbage OUT.. This data also has several issues and thus before doing any analysis we will need to clean this data.
1. Rest_name: Several restaurant names have special characters (\r\n) at the end. These need to be removed.
2. Cost, dine_reviews, delivery_reviews: We need to extract the numeric part from these columns and then convert these columns to integer or float to use them in the analysis.
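
The two cleaning steps above can be sketched in plain Python. This is only a minimal illustration, not the notebook's actual code: the helper names `clean_name` and `extract_number` and the sample strings are my own assumptions about what the scraped values roughly look like.

```python
import re

def clean_name(raw):
    """Strip leading/trailing whitespace such as \\r\\n from a restaurant name."""
    return raw.strip()

def extract_number(raw):
    """Return the first numeric run in raw as a float, or None if there is none."""
    if raw is None:
        return None
    # Digits, optionally with thousands separators and a decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    return float(match.group().replace(",", ""))

# Made-up sample values, only to show the behaviour.
print(clean_name("Cafe Goodluck\r\n"))
print(extract_number("Rs. 1,200 for two"))
print(extract_number("(4,532 Reviews)"))
```

Returning None for values without digits keeps missing entries visible instead of silently turning them into 0.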

We can see that Delivery rating, Delivery Reviews, and liked have the most missing values. More than half of these values are missing. Below are my assumptions for this missing data:
1. For Delivery columns — Many high-end restaurants do not have delivery options. So if these values are missing, my take is that this value is not available as these restaurants might be dine-in only.
2. Liked — This column tells us the food that was liked by most customers. This might be missing due to an insufficient number of reviews.
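
To check how much of each column is missing, something along these lines would do. It is a rough sketch that assumes the rows are plain dictionaries; the function name `missing_share`, the key names, and the sample rows are hypothetical.

```python
def missing_share(rows, columns):
    """Fraction of rows in which each column is missing (None or empty string)."""
    total = len(rows)
    if total == 0:
        return {col: 0.0 for col in columns}
    return {
        col: sum(1 for row in rows if row.get(col) in (None, "")) / total
        for col in columns
    }

# Made-up rows, not taken from the scraped dataset.
rows = [
    {"delivery_rating": None, "liked": "Biryani"},
    {"delivery_rating": 4.1, "liked": ""},
    {"delivery_rating": None, "liked": None},
    {"delivery_rating": None, "liked": "Pizza"},
]
print(missing_share(rows, ["delivery_rating", "liked"]))
```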

Data Visualization

Now since we are done with basic data cleaning, let us start plotting…….

Maximum no of outlets in Pune?

Not Dominos or McD.. but Monginis is the biggest chain in Pune! Well, who could have thought that..
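
Counting outlets per chain boils down to counting how often each restaurant name appears. Here is a small sketch using `collections.Counter`; the function name `top_chains` and the sample list are made up for illustration and are not the scraped counts.

```python
from collections import Counter

def top_chains(names, n=3):
    """Return the n most frequent restaurant names with their outlet counts."""
    # Strip stray whitespace first so "Monginis\r\n" and "Monginis" count together.
    return Counter(name.strip() for name in names).most_common(n)

# Made-up sample, only to show the idea.
sample = [
    "Monginis\r\n", "Domino's Pizza", "Monginis",
    "McDonald's", "Monginis", "Domino's Pizza",
]
print(top_chains(sample, 2))
```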

Khaau Gali in Pune?

Wow! Not Koregaon Park, but Kothrud has the maximum number of restaurants in Pune.

Biryani or Pizza? What do we love the most?

See.. told you.. here Joey Loves Biryani more… Sorry Pizza lovers :(

Curious to know the most expensive (not for the faint-hearted) and most reviewed restaurants?

Among the 8 most expensive restaurants, 3 are Westin eateries.

Can we really trust those ratings?

OMG!! The rating is 5.0 ..but wait, only 3 reviews … This is exactly why I have trust issues.. duh!!
To get a fair idea of ratings, I have calculated a weighted rating based on the number of reviews for each restaurant. Below is how we can calculate a fair rating:
1. Find the total number of reviews for all restaurants combined.
2. Calculate the weight of reviews = (number of reviews / total number of reviews).
3. Weighted rating = rating * weight of reviews.
4. The final step is to normalize the weighted ratings to bring them into the range of 0 to 5 in this case.
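
The four steps above can be sketched as below. One loud assumption: the normalization method is not spelled out here, so I use min-max scaling to 0 to 5 as one plausible choice. The function name `weighted_ratings`, the dictionary keys, and the sample numbers are hypothetical too.

```python
def weighted_ratings(restaurants, top=5.0):
    """Weight each rating by its share of all reviews, then min-max scale to [0, top]."""
    # Step 1: total number of reviews across all restaurants.
    total_reviews = sum(r["reviews"] for r in restaurants)
    # Steps 2 and 3: weight = reviews / total, weighted rating = rating * weight.
    raw = [r["rating"] * (r["reviews"] / total_reviews) for r in restaurants]
    # Step 4 (assumed method): min-max scaling into the range 0..top.
    lo, hi = min(raw), max(raw)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in raw]
    return [top * ((x - lo) / span) for x in raw]

# Made-up numbers: a 5.0 rating with 3 reviews vs. well-reviewed places.
sample = [
    {"name": "A", "rating": 5.0, "reviews": 3},
    {"name": "B", "rating": 4.2, "reviews": 4000},
    {"name": "C", "rating": 3.8, "reviews": 1000},
]
for r, score in zip(sample, weighted_ratings(sample)):
    print(r["name"], round(score, 2))
```

With min-max scaling the lowest weighted rating always maps to 0 and the highest to 5, so the 5.0-with-3-reviews restaurant drops to the bottom. Keep in mind this scheme rewards sheer review volume heavily, which is a design choice rather than the only way to do it.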

Now you know where to go… Most of these restaurants are cheap, which may be why so many people prefer them. Please note that these ratings are weighted and thus will differ from the ratings on Zomato.

Localities of the 20 most expensive restaurants?

Many location values include the restaurant name as well.. So I split them on the delimiter and keep the string after it.
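
A quick sketch of that split. The delimiter is not named above, so the comma is purely my assumption, as are the function name `extract_locality` and the sample strings. I also assume the locality is whatever follows the last delimiter.

```python
def extract_locality(location, delimiter=","):
    """Keep the text after the last delimiter; values without one are returned as-is."""
    return location.rsplit(delimiter, 1)[-1].strip()

# Made-up examples of a location with and without a restaurant name in front.
print(extract_locality("The Westin, Koregaon Park"))
print(extract_locality("Kothrud"))
```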

Restaurants offering value for money?

Below are our criteria for finding good-quality budget restaurants:

  1. Cost < 1000
  2. Rating (dine or delivery) > 4.3
  3. No of reviews (dine or delivery) > 4000

Good budget restaurants

Now you know where you could go to save some bucks!
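
The three criteria for budget restaurants can be written as a simple filter. This is a sketch under my own reading of "dine or delivery": a restaurant passes if either channel clears the rating threshold and either channel clears the review threshold. The key names `dine_rating` and `delivery_rating` and the sample rows are assumptions.

```python
def best_available(*values):
    """Largest non-missing value, or None if every value is missing."""
    present = [v for v in values if v is not None]
    return max(present) if present else None

def is_good_budget(r, max_cost=1000, min_rating=4.3, min_reviews=4000):
    """True if cost < max_cost, rating > min_rating and reviews > min_reviews."""
    rating = best_available(r.get("dine_rating"), r.get("delivery_rating"))
    reviews = best_available(r.get("dine_reviews"), r.get("delivery_reviews"))
    cost = r.get("cost")
    return (
        cost is not None and cost < max_cost
        and rating is not None and rating > min_rating
        and reviews is not None and reviews > min_reviews
    )

# Made-up rows; missing delivery values are None, as for dine-in-only places.
sample = [
    {"name": "Cheap and loved", "cost": 400, "dine_rating": 4.5,
     "delivery_rating": None, "dine_reviews": 5200, "delivery_reviews": None},
    {"name": "Pricey", "cost": 3000, "dine_rating": 4.8,
     "delivery_rating": None, "dine_reviews": 6000, "delivery_reviews": None},
    {"name": "Too few reviews", "cost": 300, "dine_rating": 5.0,
     "delivery_rating": None, "dine_reviews": 3, "delivery_reviews": None},
]
print([r["name"] for r in sample if is_good_budget(r)])
```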

Highly Rated and Highly reviewed Expensive restaurants

  • Marriott is the most preferred amongst high-end restaurants.
  • It is interesting to note that none of them provides delivery services. Or maybe people want to go out and dine at these expensive restaurants for the ambiance?

Generating word clouds for the most-liked dish

Whew!!! Did we learn something about food culture in Pune? Let’s see —

  1. Monginis is the biggest chain in Pune.
  2. Kothrud is the hub of restaurants.
  3. Joey loves Biryani
  4. If you are always broke like me… Run whenever you hear the name Westin… Just run!! Seriously!!
  5. Bund Garden Road is no normal road.. It hosts many expensive restaurants.
  6. Joey is broke.. but needs good food!! Many people in Pune prefer cheap restaurants, as suggested by ratings and reviews.
  7. For value for money, below are a few options:
    * Vaishali
    * Momo’s Cafe
    * Dominos
    * Cafe Goodluck (I mean, what’s not to like? Bun Maska, good. Jam, good. Tea, good!)
  8. Won a lottery? Head out to Oaks Longe - Marriott, Baan Tao - Hyatt, Mix@36 - Westin to spend that money…

References —
1. https://www.kaggle.com/shahules/zomato-complete-eda-and-lstm-model
2. https://www.kaggle.com/parthsharma5795/finding-the-best-restaurants-in-bangalore

