Forest fires analysis on Portugal dataset

Mano
May 18, 2023

Forest fires are a major environmental concern. They harm wildlife and nature and affect people too. This paper is an attempt to analyze burn areas against weather observations and fire danger indices.

Data & its limits

The data for this project is from Montesinho natural park, in the Trás-os-Montes northeast region of Portugal (Ref: LINK). It is pulled from two separate databases: one with fire inspector data and a second with meteorological observations recorded every 30 minutes. The time period is Jan 2000 to Dec 2003. The data set includes spatial location, burned area and weather observations, i.e. temperature, rain, humidity and wind. The vegetation data was removed because of its poor quality. The data set does not include the time, date or year of each fire. The Canadian system of rating fire danger is used. Each index is derived from vegetation or weather conditions such as rain, temperature and humidity, but it is not clear from the data how the indices are calculated. Also, the data does not record any accidents, so we cannot predict accidental fires caused by negligence, such as campfires or equipment malfunction.

Analysis

Exploratory data analysis (EDA) is done to understand which parameters may impact the burn area. For ease of analysis, the coordinate columns X and Y, the months and the days are converted to factors (as.factor), and the data is summarized by month and weekday. Plot 1 shows that August and September have more burn area reported, especially on weekends. This may be because more people visit parks on weekends.

Plot 1: August and September have more burn area reported, predominantly on weekends.
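
The month and weekday summary behind Plot 1 can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis: the column names (month, day, area) are assumed to follow the public CSV layout, the rows are made-up values, and treating Saturday and Sunday as the weekend is also an assumption.

```python
from collections import defaultdict

# Made-up rows for illustration only; they are NOT taken from the data set.
# Column names (month, day, area) are assumed to match the public CSV.
rows = [
    {"month": "aug", "day": "sat", "area": 10.0},
    {"month": "aug", "day": "sun", "area": 5.5},
    {"month": "aug", "day": "tue", "area": 2.0},
    {"month": "sep", "day": "sat", "area": 7.0},
    {"month": "sep", "day": "wed", "area": 1.5},
    {"month": "mar", "day": "mon", "area": 0.5},
]

# Assumption: the weekend is Saturday and Sunday.
WEEKEND = {"sat", "sun"}


def burn_area_by_month_and_daytype(records):
    """Sum the burned area for every (month, weekend/weekday) pair."""
    totals = defaultdict(float)
    for record in records:
        day_type = "weekend" if record["day"] in WEEKEND else "weekday"
        totals[(record["month"], day_type)] += record["area"]
    return dict(totals)


summary = burn_area_by_month_and_daytype(rows)
for key in sorted(summary):
    print(key, summary[key])
```

Grouping on a weekend flag rather than on each individual day keeps the table small enough to compare months at a glance.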

Further EDA was done on weather factors versus burn area. Plots 2, 3, 4 and 5 show the effect of temperature, relative humidity (RH), wind and rain on burn area.

Further EDA is done to understand the influence of the indices. FFMC (Fine Fuel Moisture Code) and ISI (Initial Spread Index) have an impact on the burn area. See Plots 6 and 7 for the overall data with FFMC and ISI. The other indices did not show as strong a correlation with burn area.
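
One way to put a number on "correlation with burn area" is the Pearson coefficient. The article does not say which measure was used, so the sketch below is an assumption, and the FFMC and area values in it are made up purely to show the call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Made-up values for illustration only; NOT taken from the data set.
ffmc = [85.0, 88.0, 90.0, 92.0, 94.0]
area = [0.5, 1.0, 2.0, 4.0, 9.0]

r = pearson(ffmc, area)
print(f"Pearson r between FFMC and area: {r:.2f}")
```

Burned area is typically very skewed, so a rank-based measure or a log transform of the area may be worth comparing before drawing conclusions from a single coefficient.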

This can also be seen in the weekend subset of the data, where FFMC and ISI are higher in August and September.

Further analysis is run on the data split into records with rain (rain > 0) and without rain (rain == 0). The analysis indicates that weather conditions and FFMC have an impact on the burn area. ISI is derived from FFMC and weather conditions, so its apparent effect seems to follow from those inputs rather than being independent. It was therefore removed from the plot analysis.
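
The rain split can be sketched as follows. The column names (rain, area) are assumed from the public CSV layout, the rows are made-up values, and comparing mean burned area is just one possible summary rather than the one used in this analysis.

```python
from statistics import mean

# Made-up rows for illustration only; NOT taken from the data set.
rows = [
    {"rain": 0.0, "area": 4.0},
    {"rain": 0.0, "area": 2.0},
    {"rain": 0.2, "area": 0.0},
    {"rain": 0.0, "area": 6.0},
    {"rain": 1.0, "area": 1.0},
]


def split_by_rain(records):
    """Return (wet, dry): records with rain > 0 and records with rain == 0."""
    wet = [record for record in records if record["rain"] > 0]
    dry = [record for record in records if record["rain"] == 0]
    return wet, dry


wet, dry = split_by_rain(rows)
wet_mean = mean(record["area"] for record in wet)
dry_mean = mean(record["area"] for record in dry)
print(f"with rain:    n={len(wet)}, mean area={wet_mean}")
print(f"without rain: n={len(dry)}, mean area={dry_mean}")
```

It helps to print the size of each subset next to its mean, because a subset with only a handful of rainy records cannot support strong conclusions.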

A heat map is plotted to understand the impact of location. The heat map (see Plot 10) shows that certain coordinates (X, Y) have more burn area. It is not clear whether this effect of location reflects vegetation, elevation or some other factor. More data is required to understand the influence of location.

Plot 10: Heat map of mean burn area by X and Y coordinates.
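
The numbers behind a heat map like Plot 10 are the mean burned area per (X, Y) cell. Below is a minimal sketch of that aggregation, assuming integer grid coordinates in columns named X and Y and using made-up rows. It prints a text grid instead of drawing a plot.

```python
from collections import defaultdict

# Made-up rows for illustration only; NOT taken from the data set.
rows = [
    {"X": 1, "Y": 2, "area": 2.0},
    {"X": 1, "Y": 2, "area": 4.0},
    {"X": 3, "Y": 4, "area": 10.0},
    {"X": 2, "Y": 2, "area": 0.0},
]


def mean_area_grid(records):
    """Map each (X, Y) cell to the mean burned area of its records."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        cell = (record["X"], record["Y"])
        sums[cell] += record["area"]
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


grid = mean_area_grid(rows)

# Print one text row per Y value; cells without any record show "-".
x_values = range(min(x for x, _ in grid), max(x for x, _ in grid) + 1)
y_values = range(min(y for _, y in grid), max(y for _, y in grid) + 1)
for y in y_values:
    cells = [f"{grid[(x, y)]:6.1f}" if (x, y) in grid else "     -" for x in x_values]
    print(f"Y={y}", " ".join(cells))
```

Keeping empty cells distinct from cells with a mean of zero matters here, since "no fires recorded" and "fires recorded with no measurable area" are different findings.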

Conclusion/Recommendations

Weather conditions (temperature, humidity, rain and wind) and indices such as FFMC can be used to predict forest fire burn areas. The heat map indicates that location can also impact the burn area. However, it is not clear whether the effect of location comes only from vegetation or also from other factors, such as elevation or spots that are popular with visitors.

The data analysis shows that more incidents occur during hot and dry weather, i.e. when temperatures are high and humidity is low, which is the case in August and September. More fire incidents may also have been recorded because more people visit parks during the summer break. However, the data doesn’t consider the human factor. According to the U.S. National Park Service (NPS), nearly 85 percent of wildland fires in the United States are caused by humans. (Source: 2000–2017 data based on Wildland Fire Management Information (WFMI) and U.S. Forest Service Research Data Archive).

Based on the data, it is important to review factors that were removed or not completely recorded, such as vegetation and human factors. Analyzing the full data set over a longer period can help show whether any other factors impact the burn area. This analysis suggests that it is imperative to understand the data and the cause-and-effect relationships between factors before drawing conclusions.

