Q#109: Snowiest months

Note: I believe everyone should be able to gain value from data freely, so I pledge to keep this series free no matter how large it grows.

You’re given the following dataset, containing information about a year’s worth of weather. Using this data, calculate the percent of time it was snowing each month. Note: this will require manipulating and classifying the existing data.

TRY IT YOURSELF

ANSWER

Step 1: Importing and Exploring the Data

Before we start, let’s import the necessary libraries and load the dataset.

%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Import data
weather_2012 = pd.read_csv('https://raw.githubusercontent.com/erood/interviewqs.com_code_snippets/master/Datasets/weather_2012.csv', parse_dates=True, index_col='Date/Time')
# Preview data
weather_2012.head()

The dataset contains various columns, including temperature, humidity, wind speed, and weather conditions. Our focus will be on the Weather column, which provides descriptions of the weather at each recorded time.
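Before classifying anything, it can help to see which labels actually occur in the Weather column. The snippet below is a minimal sketch on a toy Series; the labels in it are illustrative assumptions, not values taken from the real file.

```python
import pandas as pd

# Toy stand-in for the Weather column (labels are illustrative assumptions)
weather = pd.Series(
    ["Fog", "Snow", "Rain,Snow", "Fog", "Clear", "Snow"],
    name="Weather",
)

# Count how often each distinct label appears, most frequent first
label_counts = weather.value_counts()
print(label_counts)
```

On the real data, the same call would be weather_2012['Weather'].value_counts().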

Step 2: Identifying Snowy Conditions

The next step is to identify rows where it was snowing. In the Weather column, snow-related conditions might be labeled as "Snow," "Snow Showers," "Blowing Snow," etc. We can classify these by looking for the presence of the word "Snow" in the Weather descriptions.

# Classify whether it was snowing
weather_2012['Snow'] = weather_2012['Weather'].str.contains('Snow', case=False, na=False)
# Preview the classification
weather_2012[['Weather', 'Snow']].head()

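This code creates a new Boolean column, Snow, that is True wherever the Weather description contains "snow" in any letter case (case=False) and False otherwise. Rows with a missing Weather value are treated as not snowing (na=False).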
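To see how the matching behaves on edge cases, here is a small sketch using a made-up Series. The labels are assumptions chosen to cover a combined label, a lowercase label, and a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical labels: combined condition, lowercase text, and a missing value
toy = pd.Series(["Snow", "Rain,Snow", "snow showers", "Fog", np.nan])

# Same call as in the tutorial: case-insensitive substring match, NaN -> False
is_snow = toy.str.contains("Snow", case=False, na=False)
print(is_snow.tolist())
```

Because the match is a substring test, any label that merely contains "snow" counts as snowing, including combined labels such as "Rain,Snow".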

Step 3: Calculating the Percent of Time It Was Snowing Each Month

Now that we have identified snowy conditions, the next step is to calculate the percentage of time it was snowing each month. We can group the data by month and calculate the proportion of True values in the Snow column.

# Extract month from the Date/Time index
weather_2012['Month'] = weather_2012.index.month

# Calculate the percent of time it was snowing each month
monthly_snow = weather_2012.groupby('Month')['Snow'].mean() * 100
# Display the results
print(monthly_snow)

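This code extracts the month from the Date/Time index, groups the data by month, and calculates the mean of the Snow column. Because True counts as 1 and False as 0, the mean is the fraction of readings in that month with snow, and multiplying by 100 converts it to a percentage. Treating this as the "percent of time" assumes the readings are evenly spaced in time.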
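The arithmetic can be checked by hand on a tiny example. This sketch uses an invented six-row frame (four readings in January, two in February), so the timestamps and values are assumptions for illustration only.

```python
import pandas as pd

# Invented timestamps: four readings in January, two in February
idx = pd.to_datetime([
    "2012-01-01 00:00", "2012-01-01 01:00",
    "2012-01-01 02:00", "2012-01-01 03:00",
    "2012-02-01 00:00", "2012-02-01 01:00",
])
toy = pd.DataFrame({"Snow": [True, True, False, True, False, False]}, index=idx)

# Mean of a Boolean column is the share of True values; times 100 gives percent
pct = toy.groupby(toy.index.month)["Snow"].mean() * 100
print(pct)
```

January has three snowy readings out of four, so it comes out at 75 percent, while February has none and comes out at 0.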

Step 4: Visualizing the Results

Finally, let’s plot the monthly percentages so they are easier to compare at a glance.

# Plot the percentage of time it was snowing each month
monthly_snow.plot(kind='bar', color='skyblue')
plt.title('Percentage of Time It Was Snowing Each Month')
plt.xlabel('Month')
plt.ylabel('Percentage (%)')
plt.xticks(rotation=0)
plt.show()

This bar plot shows one bar per month, labeled 1 through 12, so the snowiest months stand out as the tallest bars.
