AIN311 Project - Climate Change and Forest Fires - Blog 4

Improving Models and Random Forest Model Evaluation

Hüseyin Eren Doğan
Jan 8, 2024

In the last week of our project, we used the polynomial regression model to make some predictions for the future. Here are some sample results for Adana in 2025 and 2050:

Since we were not satisfied with our models’ results, we took another look at them and saw that we had forgotten something in the kNN regression model: trying different values of k. We tested the kNN model with k values in the range of 1–10.

As a result, we found that the best k value for our model is 2, and we got a lower MSE than before.
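To make the k search concrete, here is a minimal sketch of this kind of sweep. It is not our project code: it uses only the Python standard library, synthetic stand-in data, a single temperature-like feature, and a simple train/validation split, all of which are assumptions made for illustration. The helper names (`knn_predict`, `best_k`) are hypothetical.

```python
import random


def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    # Squared Euclidean distance is enough for ranking neighbours.
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def best_k(train_X, train_y, val_X, val_y, k_values=range(1, 11)):
    """Return (best k, {k: validation MSE}) for k in 1..10 by default."""
    scores = {}
    for k in k_values:
        preds = [knn_predict(train_X, train_y, x, k) for x in val_X]
        scores[k] = mse(val_y, preds)
    return min(scores, key=scores.get), scores


# Synthetic stand-in data (assumption): one feature, noisy target.
rng = random.Random(0)
X = [[rng.uniform(0, 40)] for _ in range(200)]
y = [3 * x[0] + rng.gauss(0, 10) for x in X]
train_X, val_X = X[:150], X[150:]
train_y, val_y = y[:150], y[150:]

k, scores = best_k(train_X, train_y, val_X, val_y)
print("best k:", k, "validation MSE:", round(scores[k], 2))
```

The k chosen on this synthetic data has no relation to the k = 2 we found on our real data; the point is only the loop over k and the comparison of validation MSE.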

Random Forest Model

We made a small improvement to the kNN model, but we were still not satisfied with our results, so we decided to try a new model that we hoped would give better results: random forest.

In our first two models, we had only city and temperature values to predict the number of fires. Our predictions were limited because the climate data contains only temperature, so for the random forest model we decided to also use the brightness, scan and track values from the forest fires data. That means we need to do some preprocessing first.

First, we computed the average brightness, average scan and average track values of the fires for each city and month. Then we merged the resulting data with our forest data and got a data structure like this:
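The averaging and merging step can be sketched roughly as follows. This is an illustration in plain Python, not our actual preprocessing code: the sample rows, the column names (`temperature`, `fires`, `avg_brightness`, and so on) and the helper names are all made up for the example.

```python
from collections import defaultdict

# Hypothetical fire detections: (city, month, brightness, scan, track).
detections = [
    ("Adana", 7, 320.0, 1.0, 1.0),
    ("Adana", 7, 340.0, 1.2, 1.1),
    ("Izmir", 8, 310.0, 1.4, 1.2),
]

# Hypothetical existing table keyed by (city, month).
forest = {
    ("Adana", 7): {"temperature": 29.5, "fires": 2},
    ("Izmir", 8): {"temperature": 28.0, "fires": 1},
}


def average_by_city_month(rows):
    """Average brightness, scan and track for every (city, month) pair."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    for city, month, brightness, scan, track in rows:
        acc = sums[(city, month)]
        acc[0] += brightness
        acc[1] += scan
        acc[2] += track
        acc[3] += 1
    return {
        key: {"avg_brightness": b / n, "avg_scan": s / n, "avg_track": t / n}
        for key, (b, s, t, n) in sums.items()
    }


def merge(forest_rows, averages):
    """Inner join on (city, month): keep keys present in both tables."""
    return {
        key: {**row, **averages[key]}
        for key, row in forest_rows.items()
        if key in averages
    }


merged = merge(forest, average_by_city_month(detections))
for key, row in merged.items():
    print(key, row)
```

Each merged row then carries the temperature, the fire count and the three averaged fire features for one city and month.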

Then we looked for outliers in the fire values:

We examined, detected and removed the outliers from the data. Now both the data and we are ready to evaluate the random forest model.
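One common way to do this kind of outlier removal is the 1.5 × IQR rule. The sketch below assumes that rule purely for illustration; this post does not state which rule we applied, and the sample values, the `fires` column name and the helper names are hypothetical.

```python
import statistics


def iqr_bounds(values, factor=1.5):
    """Return (low, high) limits using the interquartile range rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - factor * iqr, q3 + factor * iqr


def remove_outliers(rows, column, factor=1.5):
    """Keep only rows whose value in `column` lies inside the IQR limits."""
    low, high = iqr_bounds([row[column] for row in rows], factor)
    return [row for row in rows if low <= row[column] <= high]


# Made-up monthly fire counts with one obvious outlier.
rows = [{"fires": v} for v in [3, 4, 5, 4, 6, 5, 4, 3, 5, 120]]
cleaned = remove_outliers(rows, "fires")
print(len(rows), "rows before,", len(cleaned), "rows after")
```

The `factor` parameter controls how strict the filter is; a larger value removes fewer rows.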

The results of the evaluation:

Mean Squared Error (MSE): 39.89

Root Mean Squared Error (RMSE): 6.31
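RMSE is simply the square root of MSE, which puts the error back in the same unit as the target (number of fires). A quick check using the unrounded random forest MSE reported in this post:

```python
import math

# Unrounded random forest MSE as reported in this post.
reported_mse = 39.8989518902439

# RMSE is the square root of MSE.
rmse = math.sqrt(reported_mse)
print(f"RMSE = {rmse:.4f}")
```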

Finally, we got some good error values from one of our models. If we look at all of our results, we can see that the random forest model outperforms the others:

Linear Regression Model
Mean Squared Error: 7421.040521343406
R-squared: 0.2917232023289914

kNN Regression Model
Mean Squared Error: 8305.100189035917
R-squared: 0.20734703586251368

Random Forest Model
Mean Squared Error: 39.8989518902439
R-squared: 0.9995693243301826
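For readers less familiar with R-squared: it compares the model's squared error with the squared error of always predicting the mean, so 1 means a perfect fit and 0 means no better than the mean. A minimal sketch with made-up numbers (the function name `r_squared` and the toy values are ours for this example, not from the project data):

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Squared error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Squared error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


y_true = [10, 20, 30, 40]
perfect = r_squared(y_true, y_true)              # predictions equal the targets
baseline = r_squared(y_true, [25, 25, 25, 25])   # always predict the mean
print(perfect, baseline)
```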

Gökhan Çelik & Hüseyin Eren Doğan & Umut Şahin
