How I Built My Forest Fires Predictor

Sara-Grace Lien
Published in AI @ UCI
2 min read · Sep 10, 2020

The climate is changing. There is no denying that. Floods have plagued South Asia, leaving up to 12 million people homeless with nowhere to go. Places like Guatemala, Honduras, and El Salvador are seeing an extended dry season that has lasted up to 6 months. And Australia started 2020 with its worst-ever bushfire season.

I was curious about the Australian forest fires and wanted to see how I could use my knowledge of AI to predict their patterns.

So I got to work.

I found that since the 1980s, wildfire season has lengthened across the earth's surface. In 2018, California recorded its worst-ever wildfires. By 2019, wildfires had burned 2.5 million acres across Alaska.

I brainstormed several ways I wanted to use AI.

That's when I found this dataset on Kaggle. It records each forest fire in Australia and New Zealand under multiple categories, including:

Longitude

Latitude

Brightness

Instrument

I was kind of fazed by these categories, not going to lie. I was always the computer vision kind of person and had never really ventured into categorical data. But you know, it is never too late to try something new.

This was difficult for me. I looked into many different ways to handle categorical data. I tried plotting it, but found that it didn’t really do anything for me. I just liked the visuals. I was confused for a long time. But eventually, I did it.

I used a linear regression prediction model.

This was useful in several ways.

Linear regression is one of the simplest and most common supervised machine learning algorithms that data scientists use for predictive modeling. It models the relationship between a response variable and one or more predictor variables as a straight-line (linear) function. Since our data has several categories that relate to the presence of wildfires, this model is especially useful.
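To make that concrete, here is how the model could be written out for this dataset. The symbols are my own notation and not from the original project: the betas are the numbers the model learns, and the epsilon is whatever error is left over.

```latex
\[
\text{confidence} = \beta_0
  + \beta_1 \,\text{latitude}
  + \beta_2 \,\text{longitude}
  + \beta_3 \,\text{brightness}
  + \varepsilon
\]
```

Here \(\beta_0\) is the intercept and each of \(\beta_1, \beta_2, \beta_3\) is the slope for one predictor.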

What I did was take the confidence level of the fires and compare it to the other categories. The confidence level is directly correlated with the presence of forest fires. I separated the confidence levels from the rest of the data and used them as the response variable. The rest of the data, like longitude, latitude, and brightness, is used as the predictor variables, which go under dangerlvl. After splitting the data into training and testing sets, I imported the linear regression model from scikit-learn.
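Here is a minimal sketch of what those steps might look like. It is not my original notebook: the data below is randomly generated as a stand-in for the Kaggle file, the column names and the 80/20 split are assumptions, and only the name dangerlvl comes from the description above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Kaggle fire data (assumption: the real file has
# numeric latitude, longitude, brightness, and confidence columns).
rng = np.random.default_rng(0)
n = 200
fires = pd.DataFrame({
    "latitude": rng.uniform(-44.0, -10.0, n),
    "longitude": rng.uniform(113.0, 154.0, n),
    "brightness": rng.uniform(300.0, 500.0, n),
})
# Made-up relationship so the example has something to learn.
noise = rng.normal(0.0, 5.0, n)
fires["confidence"] = (0.4 * (fires["brightness"] - 300.0) + noise).clip(0, 100)

# Predictor variables go under dangerlvl; confidence is the response variable.
dangerlvl = fires[["latitude", "longitude", "brightness"]]
confidence = fires["confidence"]

# Hold out 20% of the rows for testing (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    dangerlvl, confidence, test_size=0.2, random_state=0
)

# Fit the linear regression on the training rows, then predict the held-out rows.
model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

With the real dataset, the synthetic DataFrame would be replaced by the loaded CSV, and the rest would stay roughly the same.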

During training, the model finds the best values for the intercept and for the slope of each predictor, which results in the line that best fits the data.
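To show what "finding the best intercept and slope" means, here is a small sketch with just one predictor, using the standard least-squares formulas in plain Python. The function name fit_line and the numbers are made up for illustration. The real model has several predictors and is fitted by scikit-learn, but the idea is the same.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The best-fit line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values: confidence rises by 0.5 for every unit of brightness.
brightness = [300.0, 320.0, 340.0, 360.0, 380.0]
confidence = [0.5 * b - 140.0 for b in brightness]

intercept, slope = fit_line(brightness, confidence)
print(f"intercept={intercept:.1f}, slope={slope:.1f}")
```

Because the made-up points lie exactly on a line, the fit recovers that line. Real data is noisy, so the fitted line only gets as close as it can.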

After training and making predictions, I took 25 records from the data and plotted my predictions on a grid. As you can see, the predictions are not perfect, but they are pretty close to the actual values.
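A sketch of how that comparison could be put together is below. Again, the data is synthetic, the column names Actual and Predicted and the helper plot_comparison are hypothetical, and the mean absolute error is just one possible way to put a number on "pretty close".

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): confidence depends on brightness plus noise.
rng = np.random.default_rng(1)
n = 200
brightness = rng.uniform(300.0, 500.0, n)
X = pd.DataFrame({"brightness": brightness})
y = pd.Series(0.4 * (brightness - 300.0) + rng.normal(0.0, 5.0, n), name="confidence")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_train, y_train)
predictions = model.predict(X_test)

# Put actual and predicted values side by side and keep the first 25 records.
comparison = pd.DataFrame(
    {"Actual": y_test.to_numpy(), "Predicted": predictions}
).head(25)

# Average size of the prediction error over those 25 records.
mae = mean_absolute_error(comparison["Actual"], comparison["Predicted"])


def plot_comparison(frame):
    """Draw actual vs. predicted values as paired bars on a grid (needs matplotlib)."""
    import matplotlib.pyplot as plt

    ax = frame.plot(kind="bar", figsize=(10, 6))
    ax.grid(which="major", linestyle="-", linewidth=0.5)
    ax.set_xlabel("Record")
    ax.set_ylabel("Confidence")
    plt.show()
```

Calling plot_comparison(comparison) draws the chart. The closer each pair of bars is in height, the better the prediction for that record.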

As we continue to work on AI, we can explore its different applications in the fight against climate change.
