The Best State in India (for me!)

An exercise in data collection

Ankita Prakash
Jul 9, 2022

Last week, I started the course “Learn Data Visualization” by Ashris Choudhury of India in Pixels.

The first assignment of the course is to define six characteristics of a good state, and find datasets to compare all the states of India based on these six traits. Then one needs to find out the best state of India as per their definition.

I have put my twist on the task, and transformed it to “The Best State in India (for me!)”. The six characteristics that I have chosen are based on my personal preferences — traits that define a state where I would want to live. I must mention that the traits considered in this exercise may reek of privilege at some points. However, that’s a direct result of the personalization of the task to my tastes.

Understanding the article
The section headings refer to the traits under consideration.
Each section is divided into 3 parts: why I have chosen this specific characteristic, the data collected and visible results, and further steps I can take to refine the analysis.

In this exercise, I have also tried to incorporate different sources and types of data. They have been highlighted throughout to draw your attention to them.

Air Pollution

Ever since childhood, I have faced minor breathing issues. But the problems get aggravated in the presence of air pollution. I struggle to breathe outside during the winter mornings of Kolkata. Note that Kolkata is one of the least polluted metropolitan cities of the country. Hence, I don’t want to end up in a state where I am stuck at home and cannot go out because the air is so polluted.

The first variable that comes to mind regarding air pollution is the Air Quality Index (AQI). Fortunately, this data is widely collected and easily accessible.

Here is a table that presents the states with a “POOR” air quality level (as of 9th July 2022 at 11:30 am), collected from AQI India, a real-time pollution monitoring platform.

Source: AQI India

While Chandigarh, Delhi and Haryana have the worst air quality in the country, Meghalaya, Mizoram, and Manipur have the best, as evident in the following table.

Source: AQI India

AQI India presents real-time data on air pollution. For a better analysis, one can use a data source that reports air quality averaged over several years.
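As a rough illustration of that averaging step, here is a minimal Python sketch. The state names and AQI readings are placeholders that I have invented for the example, and the function name `average_aqi_by_state` is hypothetical; none of it comes from AQI India.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (state, year, AQI) readings, invented purely for illustration.
readings = [
    ("State A", 2020, 180),
    ("State A", 2021, 220),
    ("State B", 2020, 60),
    ("State B", 2021, 40),
]


def average_aqi_by_state(rows):
    """Return {state: mean AQI across all of that state's readings}."""
    by_state = defaultdict(list)
    for state, _year, aqi in rows:
        by_state[state].append(aqi)
    return {state: mean(values) for state, values in by_state.items()}


averages = average_aqi_by_state(readings)

# Sort from cleanest (lowest average AQI) to most polluted.
ranking = sorted(averages, key=averages.get)
print(ranking)
```

A single real-time snapshot can be skewed by the time of day or the season, so comparing multi-year averages like this should give a steadier picture.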

Rainfall

I have never liked rains. As someone who has always had to travel locally, rains are a major source of problems. Roads get flooded and conveyance is highly affected. That is why I would not like to live in a state where it rains consistently.

Weather reporting remains one of the major applications of forecasting techniques to date. Thanks to it, weather data is collected in detail almost everywhere.

The India Meteorological Department presents the statewise rainfall information on their website.

Source: India Meteorological Department

In the choropleth shared above, the states shaded green received normal rainfall.

This map also covers a short time period, 01-06-2022 to 08-07-2022. A better analysis would be based on information collected over a longer period, at least a year.

Power Outages

I am a Gen-Z individual, who cannot live without the Internet. I need my social media and OTT platforms. Survival would be difficult for me in a state that experiences major power cuts.

Since this trait is an eccentric one and not what one would naturally expect in an analysis of different states in the country, I thought I might also take The Road Not Taken for its data collection.

An article by Business Today presented a list of ten states facing major power outages in May 2022.

The states are: Delhi, Haryana, Uttar Pradesh, Uttarakhand, Andhra Pradesh, Bihar, Punjab, Rajasthan, Kerala and Jharkhand.

The article has data on the states facing the most power cuts. I wonder what the data for the states facing the fewest power cuts looks like. The next step here may be to find a direct source of data that provides information on all aspects of power outages (hours in a day, frequency, etc.) in different states.

Public transport network

As mentioned previously in the section on Rainfall, I have used public transport for travelling locally since a young age. I prefer not to take an Uber or Ola to places that I can easily reach by bus. I would like to maintain this habit, and that is only possible in a state with a rich network of public transport.

The Road Transport Yearbook (2017–18 & 2018–19) by the Ministry of Road Transport and Highways presents detailed information on this subject.

The following table contains information on the Total Bus Fleet and Buses in Public Sector (SRTUs) (State-wise) (As on 31st March, 2013–2019).

Source: Road Transport Yearbook (2017–18 & 2018–19), Ministry of Road Transport and Highways

Karnataka, Madhya Pradesh and Tamil Nadu are the three states that have the largest number of buses in the country.

A point of concern that arises in my mind here is that states with a larger area may naturally have more transport coverage than small states like Goa, Himachal Pradesh, Sikkim, etc. It may be wise to account for area when one analyzes the public transport network of the different states.
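One simple way to account for area is to compare buses per 1,000 sq. km instead of raw bus counts. The sketch below assumes this particular normalization, and every number in it is a made-up placeholder rather than a figure from the Yearbook; the function name `buses_per_1000_sq_km` is hypothetical too.

```python
# Hypothetical bus counts and areas (sq. km), invented for illustration only.
states = {
    "Large State": {"buses": 20000, "area_sq_km": 200000},
    "Small State": {"buses": 1000, "area_sq_km": 4000},
}


def buses_per_1000_sq_km(buses, area_sq_km):
    """Bus density: number of buses for every 1,000 sq. km of area."""
    return buses * 1000 / area_sq_km


density = {
    name: buses_per_1000_sq_km(data["buses"], data["area_sq_km"])
    for name, data in states.items()
}

# Rank from densest to sparsest bus coverage.
by_density = sorted(density, key=density.get, reverse=True)
print(density)
print(by_density)
```

In this toy example the large state has twenty times as many buses, yet the small state comes out ahead once area is taken into account. Dividing by population instead of area would be another reasonable choice.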

Restaurants

I am a big-time foodie. I love trying different cuisines and learning more about food in general. Eating out is a major part of my life, and it can only be nurtured by a bustling restaurant ecosystem. This is the next characteristic that my best state needs to have.

What better source to collect information on this subject than Zomato? Shruti Mehta has shared a comprehensive dataset on the restaurants listed on Zomato on the data science platform Kaggle. However, this dataset does not directly contain any information on the state that the restaurant is located in.

That is why I have referred to this analysis of the same dataset by Dipankar Nath, available on GitHub.

Dipankar used a different geographical dataset to map the restaurants to the states they are located in and presented the information in a graph (added below).

Source: Dipankar Nath's analysis on GitHub

However, the colors of this choropleth are not distinct enough. Only three states, Delhi, Haryana and Uttar Pradesh, stand out in terms of the number of restaurants that have been rated by Zomato users.

The next move here would be to use the dataset compiled by Dipankar and perform a more detailed analysis of the number of restaurants in the different states of the country.
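A minimal sketch of what that more detailed analysis might start with is a plain count of restaurants per state. The records and the `state` field below are hypothetical placeholders; I am not assuming anything about the actual column names in Dipankar's dataset.

```python
from collections import Counter

# Hypothetical restaurant records, invented for illustration only.
# The "state" key is an assumed field name, not the real dataset's schema.
restaurants = [
    {"name": "Restaurant 1", "state": "State A"},
    {"name": "Restaurant 2", "state": "State B"},
    {"name": "Restaurant 3", "state": "State A"},
]

# Count how many restaurants fall in each state.
counts = Counter(r["state"] for r in restaurants)

# most_common() lists states from the most restaurants to the fewest.
top_states = counts.most_common()
print(top_states)
```

Exact counts like these make it easier to compare states that look almost identical on a map with similar shades.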

Events

Confession time: I am not a party person and I have never gone out clubbing to date. But what I am is a literary meet person, a cultural festival person, and even a tech conference person. I love attending events of this sort and spend most of my weekends doing that. That is why I prefer to live in a city that is eventfully alive.

I couldn’t come across any direct source of data for this characteristic. Governments have some data on the events hosted by the ministries. Since the events I am interested in have a much larger scope than that, I thought of putting Google Trends to some good use.

I searched the term “events near me” and got exactly what I wanted.

Source: Google Trends

And The Best State for me is🥁

GOA!

Goa performed well, if not the best, in almost all of the 6 characteristics that I collected data on. It is not highly polluted. It does experience excess rainfall, but I believe I will have to manage with that. Goa is not among the 10 states that experience the most power cuts. Being a tourist destination, it has a decent public transport network and restaurants. And it is pretty eventful. (We don’t need data for that, do we?)
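I picked Goa by eyeballing the results, but the comparison could be made more systematic. One option, sketched here under my own assumptions, is to rank the states on each trait and add up the ranks, so that the lowest total wins. The states, traits and scores below are invented placeholders (with higher meaning better for every trait), and `rank_sum` is a hypothetical helper, not something I actually ran for this article.

```python
# Hypothetical per-trait scores, invented for illustration only.
# Assumption: for every trait here, a higher score means a better state.
scores = {
    "State A": {"air": 0.9, "transport": 0.4, "restaurants": 0.5},
    "State B": {"air": 0.6, "transport": 0.7, "restaurants": 0.8},
    "State C": {"air": 0.2, "transport": 0.9, "restaurants": 0.6},
}


def rank_sum(scores):
    """Rank states on each trait (1 = best) and add up the ranks per state."""
    traits = next(iter(scores.values())).keys()
    totals = {state: 0 for state in scores}
    for trait in traits:
        ordered = sorted(scores, key=lambda s: scores[s][trait], reverse=True)
        for position, state in enumerate(ordered, start=1):
            totals[state] += position
    return totals


totals = rank_sum(scores)

# The state with the lowest rank total is the best all-rounder.
best = min(totals, key=totals.get)
print(totals, best)
```

Summing ranks treats all traits as equally important. If clean air matters more to you than restaurants, you could multiply each rank by a weight before adding it.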

And that brings us to the end.

In this data collection exercise, I have tried to involve diverse sources of data in the hope that you start looking at less common sources as well.

I absolutely loved completing this exercise. It tickled my creativity, and I had to think differently to figure out the process of doing what I set out to do.

There are a lot more weeks to go in the course. I am hopeful that the next assignments will be as interesting as this one was, if not more.

So make sure that you follow my Medium profile to stay updated.


Ankita Prakash, an analyst at a startup, writes about business, analytics, writing, etc. and her personal journey as an early-career professional.