Weather Forecasting with Python Machine Learning (Beginner)

Oct 30, 2022

Let’s predict the weather using a simple machine learning algorithm

Have you ever wondered how people predict the weather, even though the forecast for our local area still sometimes turns out wrong? How reliable are weather forecasts?

A seven-day forecast can accurately predict the weather about 80 percent of the time and a five-day forecast can accurately predict the weather approximately 90 percent of the time. However, a 10-day — or longer — forecast is only right about half the time.

Weather forecasting is an important science. Accurate forecasting can help to save lives and minimize property damage. It’s also crucial for agriculture, allowing farmers to track when it’s best to plant or helping them protect their crops. And it will only become more vital in the coming years.

So, today we are going to build a simple weather forecasting model that predicts the upcoming weather based on available data.

Project Overview

The dataset I’m using here is the Seattle weather dataset from Kaggle. Here’s the link to my GitHub. The machine learning models used are:

XGB Classifier
K-Nearest Neighbors Classifier
AdaBoost Classifier

The metrics we’ll use to evaluate the models are accuracy and the F1-score. We choose accuracy as a metric because it is a good reference for algorithm performance when the data set has very close numbers of false negatives and false positives. However, if the numbers are not close, then we should use the F1-score as the reference.
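To make the difference between the two metrics concrete, here is a small sketch on made-up labels (the two lists below are invented for illustration and are not taken from the Seattle dataset). It uses macro-averaged F1, which is my choice for the example; note that micro-averaged F1 is identical to accuracy for single-label multiclass problems, so it would not show the gap.

```python
from sklearn.metrics import accuracy_score, f1_score

# Made-up labels: class 0 dominates, class 1 is rare
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A lazy model that always predicts the majority class
y_pred = [0] * 10

accuracy = accuracy_score(y_true, y_pred)
# Macro F1 averages the per-class F1, so the completely missed class drags it down
macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)

print(f"accuracy: {accuracy:.2f}")
print(f"macro F1: {macro_f1:.2f}")
```

Accuracy looks decent here because most days belong to the majority class, while the macro F1-score is much lower because one class is never predicted at all.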

Take a Look at Data:

There are 6 Variables in this Dataset:

4 Continuous Variables.
1 Variable to accommodate the Date.
1 Variable that refers to the Weather.

Data Pre-Processing:

We need to convert the dtype of some columns so the data fits the machine learning model. First, we convert the dtype of date from object to datetime.

The weather column holds its values as strings, and it is the column we want to predict, so we convert it to integer labels.

import pandas as pd
from sklearn.preprocessing import LabelEncoder

df['date'] = pd.to_datetime(df['date'])
df['weather'] = LabelEncoder().fit_transform(df['weather'])
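If you want to see what these two conversions actually do, here is a minimal sketch on a tiny stand-in frame. The rows and the names `sample` and `encoder` are assumptions for illustration only, not data from the real file. Keeping the encoder in a variable is a small variation on the snippet above that lets us turn the integers back into weather names later.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Tiny stand-in frame; the values are illustrative, not rows from the real dataset
sample = pd.DataFrame({
    "date": ["2012-01-01", "2012-01-02", "2012-01-03"],
    "weather": ["rain", "sun", "rain"],
})

# object -> datetime64
sample["date"] = pd.to_datetime(sample["date"])

# Keep a reference to the encoder so the integers can be decoded later
encoder = LabelEncoder()
sample["weather"] = encoder.fit_transform(sample["weather"])

print(sample.dtypes)
print(list(encoder.classes_))                               # label names, sorted
print(list(encoder.inverse_transform(sample["weather"])))   # integers back to names
```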

Now we have a data frame that is ready to be processed.

Exploratory Data Analysis:

Let’s take a look at the line plot of Min and Max Temperature:

import matplotlib.pyplot as plt

plt.figure(figsize=(15, 5))
# One line per column: daily minimum and maximum temperature over time
plt.plot(df['date'], df[['temp_min', 'temp_max']])
plt.grid()

We can see there is no missing data and there are no outliers. The temperature follows the same pattern every year (I mean, obviously).

December has the coldest days of the year, while August has the hottest. December also has the lowest standard deviation, which means its temperature varies the least, while July has the highest.
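Monthly statistics like these can be computed with a groupby on the month of the date column. The sketch below runs on synthetic temperatures (a seasonal curve plus noise that I made up), so its numbers will not match the real Seattle data; `demo` and `monthly` are names chosen for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: a seasonal curve plus random noise (assumption)
rng = np.random.default_rng(0)
dates = pd.date_range("2012-01-01", "2013-12-31", freq="D")
day = dates.dayofyear.to_numpy()
seasonal = 15 - 10 * np.cos(2 * np.pi * (day - 15) / 365.25)
demo = pd.DataFrame({
    "date": dates,
    "temp_max": seasonal + rng.normal(0, 2, len(dates)),
})

# Mean and standard deviation of temp_max for each calendar month
monthly = demo.groupby(demo["date"].dt.month)["temp_max"].agg(["mean", "std"])
monthly.index.name = "month"

print(monthly.round(2))
print("coldest month:", monthly["mean"].idxmin())
print("hottest month:", monthly["mean"].idxmax())
print("least variable month:", monthly["std"].idxmin())
```

On the real data frame, the same groupby should work with `df` in place of `demo`, once the date column has been converted to datetime.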

We can see that wind and precipitation are weakly correlated. On the other hand, temp_max and precipitation are negatively correlated, which means they move in opposite directions. The same goes for temp_max and wind.
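A correlation matrix is one way to read off relationships like these. Here is a rough sketch using pandas’ `DataFrame.corr()` on synthetic columns. The data is generated so that temp_max falls as precipitation rises and wind is independent of both; that construction is my assumption for the example, not a property measured from the real dataset.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 500

# Synthetic columns (assumption): temp_max is built to drop when it rains more
precipitation = rng.gamma(shape=1.0, scale=3.0, size=n)
temp_max = 20 - 0.8 * precipitation + rng.normal(0, 3, n)
wind = rng.normal(3, 1, n)  # generated independently of the other two

demo = pd.DataFrame({
    "precipitation": precipitation,
    "temp_max": temp_max,
    "wind": wind,
})

# Pearson correlation between every pair of columns, values in [-1, 1]
corr = demo.corr()
print(corr.round(2))
```

Values near 0 mean a weak linear relationship, and negative values mean the two variables tend to move in opposite directions.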

Training Our Machine Learning Model:

Now, we’ll train our models using the data we already cleaned.

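from sklearn.model_selection import train_test_split

features = ["precipitation", "temp_max", "temp_min", "wind"]
X = df[features]
y = df['weather']
# Hold out part of the data (25% by default) for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)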

We achieved the highest accuracy with AdaBoostClassifier, so we’re going to use it as our model.
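A comparison like this can be sketched as a loop that fits each classifier and scores it on the test split. The example below is only an illustration under assumptions: it uses synthetic data with a made-up labelling rule, default hyperparameters, and only the two scikit-learn models. An XGBClassifier from the xgboost package could be added to the `models` dictionary in the same way. The scores it prints say nothing about the real Seattle results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the cleaned data frame (assumption)
rng = np.random.default_rng(0)
n = 600
demo = pd.DataFrame({
    "precipitation": rng.gamma(1.0, 3.0, n),
    "temp_max": rng.normal(16, 7, n),
    "wind": rng.normal(3, 1, n),
})
demo["temp_min"] = demo["temp_max"] - rng.uniform(3, 10, n)
# Made-up labelling rule: 0 = dry, 1 = wet, 2 = wet and cold
demo["weather"] = np.where(
    demo["precipitation"] < 1.0, 0, np.where(demo["temp_min"] < 2, 2, 1)
)

features = ["precipitation", "temp_max", "temp_min", "wind"]
X_train, X_test, y_train, y_test = train_test_split(
    demo[features], demo["weather"], random_state=0
)

models = {
    "KNeighborsClassifier": KNeighborsClassifier(),
    "AdaBoostClassifier": AdaBoostClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1_macro": f1_score(y_test, pred, average="macro"),
    }

print(pd.DataFrame(scores).T.round(3))
```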

Try Improving Our Model:

We’ll try tuning our model’s hyperparameters using GridSearchCV:

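from sklearn.model_selection import GridSearchCV

# ab is assumed to be the AdaBoostClassifier chosen above, e.g. ab = AdaBoostClassifier()
parameters = {
    'learning_rate': [1, 2, 3],
    'n_estimators': [100, 500, 1000]
}
cv = GridSearchCV(ab, param_grid=parameters, scoring='f1_micro', n_jobs=-1, verbose=3)
cv.fit(X_train, y_train)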

Using the classification report, we can measure the quality of the predictions from a classification algorithm.

We get a small improvement in our model from hyperparameter tuning.
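The tuning and reporting steps can be sketched end to end as follows. To keep it quick and self-contained, this example makes a few assumptions of its own: the data comes from scikit-learn’s `make_classification` rather than the weather file, and the grid is deliberately much smaller than the one above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic 3-class problem with 4 features, standing in for the weather data
X, y = make_classification(
    n_samples=400, n_features=4, n_informative=3, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Deliberately small grid so the sketch runs in a few seconds
param_grid = {"learning_rate": [0.1, 1.0], "n_estimators": [25, 50]}
search = GridSearchCV(
    AdaBoostClassifier(random_state=0),
    param_grid=param_grid,
    scoring="f1_micro",
    cv=3,
)
search.fit(X_train, y_train)

# best_estimator_ is refit on the full training split with the best parameters
y_pred = search.best_estimator_.predict(X_test)
report = classification_report(y_test, y_pred)

print(search.best_params_)
print(report)
```

The report lists precision, recall, and F1-score per class, which makes it easier to see which classes the model struggles with than a single accuracy number does.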

Testing User Input:

Alright, we already got our result. This model needs “precipitation”, “temp_max”, “temp_min”, and “wind” as input.
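Here is a minimal sketch of what testing a user’s input could look like. Everything in it is an assumption for illustration: the six training rows are made up, `predict_weather` is a hypothetical helper, and the model is a default AdaBoostClassifier rather than the tuned one.

```python
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import LabelEncoder

features = ["precipitation", "temp_max", "temp_min", "wind"]

# Tiny made-up training set, only to make the sketch runnable
train = pd.DataFrame({
    "precipitation": [0.0, 0.0, 0.0, 5.2, 10.9, 8.1],
    "temp_max": [25.0, 27.5, 22.0, 10.0, 8.5, 12.0],
    "temp_min": [14.0, 15.5, 12.0, 5.0, 3.0, 6.5],
    "wind": [2.0, 2.5, 1.8, 4.5, 5.0, 3.9],
    "weather": ["sun", "sun", "sun", "rain", "rain", "rain"],
})

# Encode the string labels and keep the encoder for decoding predictions
encoder = LabelEncoder()
y = encoder.fit_transform(train["weather"])
model = AdaBoostClassifier(random_state=0).fit(train[features], y)


def predict_weather(precipitation, temp_max, temp_min, wind):
    """Return the predicted weather name for one day's measurements."""
    # Build a one-row frame with the same column names used for training
    row = pd.DataFrame([[precipitation, temp_max, temp_min, wind]], columns=features)
    code = model.predict(row)
    return encoder.inverse_transform(code)[0]


print(predict_weather(precipitation=7.0, temp_max=9.0, temp_min=4.0, wind=4.2))
print(predict_weather(precipitation=0.0, temp_max=26.0, temp_min=15.0, wind=2.1))
```

Passing the input as a DataFrame with the training column names keeps the feature order explicit, so the four values cannot be mixed up silently.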

Conclusion

Wait, that’s it? We made a weather forecast? YES. It’s simple, right?

Of course, our model isn’t as advanced as the ones used on television. This is just an illustration of how weather forecasts are made.

Credit for the birth of forecasting as a science goes to two men: Royal Navy officer Francis Beaufort and his protégé Robert FitzRoy.

What’s ahead?

In the future, we can use this model (with some improvements) to help make our renewable energy sources more reliable and efficient. For example, we can identify the optimal layout and geographical location of solar and wind power plants.