Clustering Indian States Based on Most Visited Venues
Introduction:
India has around 29 states. Which of them are visited most, and why do people visit them: for business, or simply to travel? Related questions include: Which business places are most visited? Which bus stations in a state are most visited? In which city do people visit bars most often? Many such questions can be asked of the location data we have. The problem addressed here is helping anyone who wants to know what a particular place is known for. For example, Kerala is known for “Indian Restaurant” venues, so if we plan to visit Kerala, we will likely visit at least one Indian restaurant and enjoy some good food. The task is to cluster the states based on their most visited venues and to use those clusters to give good recommendations to people who want to know what a particular state or city is known for.
Data Information
- Wikipedia: To get the list of states in India, I will scrape the Wikipedia page on Indian States.
- Google API: Once we have all the states of India, I will use the Google API to get the latitude and longitude of each state.
- FourSquare API: Once we have the location of each state, we will use the FourSquare API to get information about the most visited venues in each state and what each state is known for.
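To make the FourSquare step concrete, here is a minimal sketch of how a venue request for one state's coordinates might be assembled. It assumes the legacy FourSquare v2 `venues/explore` endpoint, which this write-up does not name, and the function name, default radius, default limit, version date, and the approximate Kerala coordinates are all illustrative placeholders. No network call is made; the code only builds the URL.

```python
from urllib.parse import urlencode

# Assumption: the legacy FourSquare v2 "venues/explore" endpoint.
# The write-up does not say which endpoint was actually used.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=10000, limit=100):
    """Return a request URL for venues around (lat, lng).

    radius (metres), limit and version are illustrative defaults,
    not values taken from the original project.
    """
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lng}",
        "radius": radius,
        "limit": limit,
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Approximate coordinates for Kerala, used purely as an example;
# the credentials are placeholders.
url = build_explore_url(10.8505, 76.2711, "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET")
```

The resulting URL could then be fetched with any HTTP client, and the venue categories in the response collected per state.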
Methodology
The aim is to find the most visited places in India and why people go to these states.
1. Scrape state information from Wikipedia
2. Get the location of those states
3. Find the most visited places using the FourSquare API
4. Cluster the states based on their similarity
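Steps 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions: the venue lists are made-up sample data with placeholder state names, each state is represented by the relative frequency of its venue categories, and a small hand-written k-means stands in for whatever clustering method was actually used (the write-up does not specify one).

```python
from collections import Counter

# Hypothetical sample data: venue categories returned for each state.
# These are NOT real results from the FourSquare API.
venues_by_state = {
    "State A": ["Hotel", "Hotel", "Indian Restaurant", "Business Service"],
    "State B": ["Hotel", "Hotel", "Business Service", "Hotel"],
    "State C": ["Bar", "Bar", "Indian Restaurant", "Bar"],
    "State D": ["Bar", "Indian Restaurant", "Bar", "Bar"],
}

# Fixed, sorted list of all categories so every vector has the same layout.
categories = sorted({c for cats in venues_by_state.values() for c in cats})


def frequency_vector(cats):
    """Share of each venue category among a state's venues."""
    counts = Counter(cats)
    return [counts[c] / len(cats) for c in categories]


def top_venues(cats, n=3):
    """The n most common venue categories for a state."""
    return [c for c, _ in Counter(cats).most_common(n)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Tiny k-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of its members (skip empty clusters).
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


states = list(venues_by_state)
vectors = [frequency_vector(venues_by_state[s]) for s in states]
cluster_of = dict(zip(states, kmeans(vectors, k=2)))

for state in states:
    print(state, cluster_of[state], top_venues(venues_by_state[state]))
```

With this sample data, the hotel-heavy states end up in one cluster and the bar-heavy states in the other. In practice a library implementation of k-means would be a more robust choice than this hand-rolled version.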
Result:
The states were clustered based on their similarities. As we saw, the most visited venues are Business Service, Hotel and Restaurant, and the states where they dominate are some of the major states of India. Surprisingly, Bus Stations are the most visited places in Maharashtra, and ‘Bar’ is the most visited venue in Daman and Diu (this is perfectly accurate :P).