AI for Forest Landscape Restoration

A predictive impact analytics model quantifies the social, economic, and environmental impact of investing in a particular Forest Landscape Restoration project.

Strivathsav Ashwin
Omdena
8 min read · Dec 25, 2021


This Omdena challenge was hosted with Trillion Tree Fund.

Authors: Deepali Bidwai, Emerson Carlos, Strivathsav Ashwin Ramamoorthy

Figure 1: Impact Overview and matrix of Trillion Tree Fund’s dashboard

“If a tree falls in a forest and no one is around to hear it, does it make a sound?” is a famous philosophical question about perception and observation. In the same spirit: do the benefits of forest landscape restoration in tackling climate change make a sound to the world, or do they simply fall, with no one to hear, observe, or perceive them?

According to the Trillion Tree Fund (1tfund), Climate change could cost the world ~$792 trillion in the next 80 years. Forest landscape restoration (FLR) helps mitigate those risks; for example, mangroves absorb 70–90% of storm surge. FLR could generate $7–$30 in economic benefits for every dollar invested. Yet, these co-benefits are undervalued by markets. This poses a major impediment to financing FLR, which faces an annual investment gap of around $400 billion.

Why is investing in FLR projects critical?

The ability of trees to absorb carbon dioxide and other gases from the atmosphere has long made them a valuable weapon in the fight against rising temperatures. A single mature tree can absorb 48 lbs of carbon dioxide a year and produce enough clean oxygen for four people to breathe fresh air.

According to the goals of the Paris Agreement, the rise in global temperature by the end of this century must be limited to 2 degrees Celsius, with aggressive efforts towards 1.5 degrees Celsius.

The Intergovernmental Panel on Climate Change (IPCC) has said that if the world wants to limit the rise to 1.5C by 2050, an extra 1bn hectares (2.4bn acres) of trees would be needed.

How do we make it sound?

After scoping the project, we quantified, in USD, the damage from flood disasters in a particular region of the United States, both with and without the benefits of an FLR project. Datasets on disasters, tree benefits, and satellite images were consolidated.

The project followed a standard pipeline: data collection, pre-processing, consolidation, exploratory data analysis (EDA), model development, and dashboard deployment.

Figure 2: Data pipeline

Data cleaning and data consolidation

The raw data was processed by omitting unwanted records, replacing missing or erroneous values, standardizing dates and times, and binning some continuous attributes. Missing categorical values were imputed with the most frequent category, missing numerical values with the mean or median, and some values (under specific conditions) with unsupervised machine learning techniques.
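The imputation steps above can be sketched with a minimal, library-free example; the column names and values here are hypothetical, not from the project's datasets:

```python
from statistics import mean, mode

# Hypothetical records with missing values (None) in one
# categorical and one numerical column.
records = [
    {"disaster_type": "Flood", "damage_musd": 120.0},
    {"disaster_type": None,    "damage_musd": 80.0},
    {"disaster_type": "Storm", "damage_musd": None},
    {"disaster_type": "Flood", "damage_musd": 40.0},
]

# Most frequent category among the observed values.
observed_types = [r["disaster_type"] for r in records if r["disaster_type"] is not None]
most_frequent = mode(observed_types)

# Mean of the observed numerical values.
observed_damage = [r["damage_musd"] for r in records if r["damage_musd"] is not None]
mean_damage = mean(observed_damage)

# Fill the gaps: mode for the categorical column, mean for the numerical one.
for r in records:
    if r["disaster_type"] is None:
        r["disaster_type"] = most_frequent
    if r["damage_musd"] is None:
        r["damage_musd"] = mean_damage
```

In practice the team would have applied the same idea via pandas or scikit-learn imputers rather than hand-rolled loops.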

After cleaning, the data was consolidated, and the team prepared a summary document capturing the list of attributes in each dataset and explaining why an EDA was warranted for it.

Exploratory Data Analysis

Natural catastrophes, such as floods, landslides, storm surges, tsunamis, earthquakes, cyclonic winds, and wildfires, are becoming increasingly frequent and intense worldwide, highlighting the need for a more holistic strategy for dealing with them. The International Emergency Disasters Database (EM-DAT) documented an annual average of 363 disasters from 1990 to 2020, with floods and storms being the most common.

Figure 3: EDA of the count of disasters and total deaths by disaster type

Average annual disaster fatalities over the period were 170,984, with earthquakes accounting for at least 1.24 million deaths globally. Storms and floods account for almost another 1 million deaths (the true number may be much higher due to nuances in data gathering).

Average annual economic losses totaled more than US$107 billion, and 175 million people were affected by disasters each year during 1990–2020. Riverine floods, tropical cyclones, and convective storms account for most of the property damage.

Figure 4: EDA on number of fatalities and total damages based on each disaster

The team also used auto-EDA libraries such as Sweetviz, D-Tale, Pandas Profiling, and AutoViz to gain quick insights into the datasets.

Modeling

Initially, all the datasets related to forests, disasters, and tree benefits were collected. After pre-processing steps such as log transformation and imputation, the datasets were used for modeling. Depending on the final output, machine learning models fall into two types: classification and regression.

We built a regression model on the training data to quantify the social, economic, and environmental impacts of flood damage and to forecast those impacts for the next 40 years. A regression model predicts a continuous dependent variable (y) from one or more independent/predictor variables (x). As shown below, the dependent variable (y) is the flood damage cost in the US, while the independent variables (x) are the factors that influence flood damage.
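In its simplest form (one predictor, made-up numbers), fitting such a model amounts to ordinary least squares:

```python
# Minimal one-predictor least-squares sketch. The data is illustrative:
# x could be, say, annual rainfall, and y the flood damage cost.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # lies exactly on y = 2x + 1

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form OLS estimates for slope and intercept.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

def predict(x):
    """Predict y (e.g. flood damage cost) for a new x."""
    return intercept + slope * x
```

The project's models use many predictors and tree-based regressors, but the principle — learn a mapping from predictors to a continuous target — is the same.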

Figure 5: AI/ ML-based model on the number of inputs

The following models were created:

  • A time series forecast (ARIMA) to predict GDP, population, and inflation rate
  • Various regressors to predict numerical values, for example, damages and the number of deaths
  • Various classifiers to predict categorical values, for example, biomass losses and wildfire datasets
  • Classification models on National Forest2_Bienville-short to find which types of trees are in abundance
  • Regression techniques on Time_series_US_1980–2021 to find total damage cost
  • A U-Net model trained on tree cover loss, which achieved an IoU score of 0.81
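The IoU (intersection-over-union) score used to evaluate the U-Net measures how well a predicted segmentation mask overlaps the ground truth; a minimal sketch on flat binary masks:

```python
def iou(pred, true):
    """Intersection-over-Union for flat binary masks (lists of 0/1)."""
    intersection = sum(p & t for p, t in zip(pred, true))
    union = sum(p | t for p, t in zip(pred, true))
    return intersection / union if union else 1.0

# Toy 1-D masks: 3 pixels overlap out of 5 pixels in the union.
pred = [1, 1, 1, 1, 0, 0]
true = [0, 1, 1, 1, 1, 0]
score = iou(pred, true)  # 3 / 5 = 0.6
```

On real 2-D tree-cover masks the same ratio is computed over all pixels; a score of 0.81 means the predicted and actual tree-cover regions overlap substantially.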

The team also used PyCaret to create the two regression models: one quantifying the flood damage cost without Trillion Tree Fund's FLR project, from the Xn predictor variables, and one quantifying the flood damage cost with the FLR project, from the same Xn predictors plus a Z tree-benefits variable.

Here is an example of the performance of the model using the Pycaret library.

Performance of all Models

Figure 6: Performance of ML models

Here, we use the compare_models function of the PyCaret library to find the best algorithm; it ranks the candidate models by their metrics. The best model for this dataset turns out to be an AdaBoost classifier. We can then train the AdaBoost classifier individually on the data and tune its hyperparameters.
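Conceptually, compare_models trains each candidate, scores it by cross-validation, and sorts the leaderboard by metric. The idea can be sketched generically; the model names and scores below are illustrative, not actual PyCaret output:

```python
# Illustrative leaderboard: model name -> cross-validated RMSE
# (lower is better). The numbers are made up for the sketch.
cv_rmse = {
    "AdaBoost": 0.42,
    "Random Forest": 0.47,
    "Linear Regression": 0.55,
}

# Rank the candidates by metric and pick the top one,
# mirroring what compare_models does internally.
leaderboard = sorted(cv_rmse.items(), key=lambda kv: kv[1])
best_model, best_score = leaderboard[0]
```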

Create Model

Figure 7: Creation of a model and measuring its performance across different metrics

Tune the Model

Figure 8: Tuning of hyperparameters
Figure 9: Performance of the regression model on train and test set

Export the model as a pickle file to integrate it.

Figure 10: Python code to pickle a file

The following are examples of the pickle files that were created.

Figure 11: List of pickle files developed based on the inputs
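Saving and reloading a model with Python's standard library works along these lines; the dictionary here is only a stand-in for a trained regressor object:

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: any picklable Python object works,
# including fitted scikit-learn or PyCaret estimators.
model = {"intercept": 1.0, "slope": 2.0}

path = os.path.join(tempfile.mkdtemp(), "flood_damage_model.pkl")

# Serialize the model to disk...
with open(path, "wb") as f:
    pickle.dump(model, f)

# ...and load it back later, e.g. inside the Streamlit app.
with open(path, "rb") as f:
    loaded = pickle.load(f)
```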

GIS

Apart from the tabular datasets, the team extracted satellite images of tree cover from Google Earth Engine (GEE). Once the images were preprocessed in the GEE platform, they were used to train a U-Net model, an architecture primarily used in computer vision for image segmentation. Tree cover images were extracted for the years 2003 to 2020, and image augmentation techniques were then applied to increase the number of images in the training dataset.
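A minimal flavor of image augmentation: generating mirrored copies of each tile to enlarge the training set. The tiny nested list stands in for a satellite tile; real pipelines operate on arrays and add rotations, crops, and color jitter as well:

```python
# Toy 2x3 "image" as nested lists of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

def hflip(img):
    """Mirror the image left-to-right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top-to-bottom."""
    return img[::-1]

# Each source tile yields several augmented training variants.
augmented = [image, hflip(image), vflip(image), hflip(vflip(image))]
```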

Deploying the Streamlit app to Heroku

Streamlit is a Python web app framework for turning machine learning models into interactive apps. Heroku is a cloud Platform as a Service (PaaS) for deploying modern apps onto the internet, and it was used to deploy our Streamlit app.

The team created onetfund_app.py using Streamlit. Now it’s time to deploy the app using Heroku.

Step 1: Run the Streamlit app locally.

To run the code locally with Streamlit, open a terminal/prompt, navigate to the directory where the onetfund_app.py file is saved, and run the following command.

# Running Streamlit locally

streamlit run onetfund_app.py

The app will open automatically in the browser.

Step 2: Create and fork the repository on GitHub

The team created a repository called dashboard-Heroku for the app.

After creating the repository, click the “fork” button.

All the files that are needed to deploy on Heroku are provided in the repository.

The repository comprises the following important files.

- Readme: This file provides details about our app.

- Python files: Apart from the main frontend landing page, six other pages were created to display the quantified social, economic, environmental, and financial impact of investing in a particular FLR project.

- Pickle files: All the models are saved as pickle files.

Apart from creating the above files directly related to the Streamlit app, the following files were created.

- Procfile: Tells Heroku to run the setup.sh file and then the Streamlit web application.

- requirements.txt: Lists all the Python libraries used in our scripts; Heroku installs these before running the application.

- setup.sh: Takes care of server-side configuration, such as the port number, which is written into Streamlit's configuration.
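These two files typically follow the standard pattern for running Streamlit on Heroku; the sketch below shows that convention and is not copied from the project's repository:

```shell
# Procfile -- one "web" process: configure Streamlit, then launch the app
web: sh setup.sh && streamlit run onetfund_app.py

# setup.sh -- write a Streamlit config that binds to Heroku's dynamic port
mkdir -p ~/.streamlit
echo "[server]
headless = true
port = $PORT
enableCORS = false
" > ~/.streamlit/config.toml
```

Heroku injects the PORT environment variable at dyno startup, which is why the port cannot be hard-coded.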

Step 3- Connect to Heroku

Once we have all the required files, it’s time to set up our app to interact with Heroku.

Head over to Heroku and create an account. Once on the Heroku dashboard, click on “Create new app”. Here we have the option to select a region.

Figure 12: Creation of new app on Heroku

Next, under the deployment method, click on GitHub and connect our GitHub account with Heroku. Once connected, search for our repository name to link it to the app.

Figure 13: Creation of repository on GitHub

The team enabled automatic deployment, so whenever the web application files change on GitHub, the app is automatically redeployed on Heroku.

We can see it installing all the required Python libraries and dependencies in real time. Once it’s done, we see the message “Your app was successfully deployed”, and clicking the View button opens the app.

Figure 14: Dashboard of Trillion Tree Fund

Conclusion

Forest landscape restoration (FLR) helps mitigate climate change risks like floods and wildfires and can be economically beneficial. Keeping in mind Trillion Tree Fund's mission of mobilizing conservation finance to restore 1.2 trillion trees and regenerate ecosystems (which would cancel out a decade of carbon emissions, generate jobs, and lessen the monetary and social impact of disasters), a team of collaborators built a predictive impact analytics dashboard that quantifies the social, economic, and environmental impact of investing in a particular Forest Landscape Restoration project.

This article originally appeared on the Omdena blog.
