
Automate the Reporting Process Using Python

Using the U.S. Wildfire Daily Report as an example

Lin Zhu
Published in Geek Culture · 6 min read · May 9, 2023


Reports are a widely accepted medium for summarizing and sharing information, so producing them is a common task at work. Most reports, such as daily sales reports, invoices, and event monitoring reports, already have fixed templates and a clear data flow. There is little room to redesign them, yet updating them regularly with new data takes considerable effort.

Thanks to Python libraries such as Jinja and xhtml2pdf, it is possible to cut redundant manual reporting work and improve efficiency by automating the whole reporting process in code.

This article takes the U.S. Wildfire Activity Monitor Report as an example to guide you through the steps of using Python to create reports, which include:

  1. Build the layout template
  2. Create the contents of the report
  3. Render the Final Report

Introduction

The U.S. Wildfire Activity Monitor Report created in this article summarizes daily wildfire activity within the U.S. and the statistics for each state.

The data comes from NASA FIRMS, which provides an open API for accessing current and historical wildfire datasets detected globally by Landsat, MODIS, and VIIRS. The API returns data in CSV format, containing information (e.g. brightness, latitude, longitude) about each detected wildfire center.

Snippet of the raw data returned by the API
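The article does not show the request itself, so here is a minimal sketch of how the CSV might be fetched. The URL pattern (`/api/area/csv/MAP_KEY/SOURCE/AREA/DAY_RANGE`), the `VIIRS_SNPP_NRT` source name, the rough bounding box for the contiguous U.S., and the helper names `build_firms_url` and `load_fires` are all assumptions for illustration; check them against the FIRMS API documentation and use your own map key.

```python
# Assumed base URL of the FIRMS area endpoint; verify against the FIRMS API docs.
FIRMS_BASE = "https://firms.modaps.eosdis.nasa.gov/api/area/csv"


def build_firms_url(map_key, source="VIIRS_SNPP_NRT",
                    area="-125,24,-66,50", day_range=1):
    """Assemble an area query as base/MAP_KEY/SOURCE/AREA/DAY_RANGE.

    `area` is west,south,east,north; the default is only a rough box
    around the contiguous U.S. (an assumption, not taken from the article).
    """
    return "/".join([FIRMS_BASE, map_key, source, area, str(day_range)])


def load_fires(url):
    """Read the CSV response into a DataFrame (hypothetical helper)."""
    import pandas as pd  # imported here so the URL helper works without pandas
    return pd.read_csv(url)


# "YOUR_MAP_KEY" is a placeholder; FIRMS issues personal keys on request.
url = build_firms_url("YOUR_MAP_KEY")
print(url)
```

Calling `load_fires(url)` with a valid key would give a DataFrame that can play the role of `fire_df` in the snippets that follow.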

The required Python libraries are:

  1. Main Python libraries

Jinja2 is a template engine for Python that supports text-based templates such as HTML.

xhtml2pdf provides functions to render an HTML page into a PDF.

  2. Other Python libraries

GeoPandas provides functions for processing geospatial data and performing geospatial operations.

Geoplot is an extension of cartopy and matplotlib for geospatial plotting.

Seaborn is a library for making statistical graphics in Python. It builds on top of matplotlib.

The whole process of generating the report can be divided into three steps: 1) building the report layout, 2) creating the contents of the report, and 3) rendering the report. The graph below illustrates the detailed workflow.

reporting process

Step 1: Build the Report Layout

The report layout is built in HTML, so a basic understanding of HTML is required for this step.

Before creating the report layout in HTML, we can first sketch the layout on paper to decide how many sections the report will include and how large each section will be. By default, the report here is created on an A4 page (210 mm x 297 mm, or 595 pt x 842 pt).
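Since the template sizes its rows in points, it can help to convert section sizes from millimetres. A small sketch of the conversion, using the standard 72 points per inch and 25.4 mm per inch (the function name `mm_to_pt` is just illustrative):

```python
MM_PER_INCH = 25.4
PT_PER_INCH = 72


def mm_to_pt(mm):
    """Convert a length in millimetres to typographic points."""
    return mm / MM_PER_INCH * PT_PER_INCH


# A4 page dimensions, rounded to whole points
a4_width_pt = round(mm_to_pt(210))
a4_height_pt = round(mm_to_pt(297))
print(a4_width_pt, a4_height_pt)  # 595 842
```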

To transfer the designed layout into HTML, it is easy to use a div and table structure. The table helps structure the page by defining rows and columns. The tutorial here provides good instructions on how to build a layout in HTML. Click here to get the final report HTML layout template.

Layout draft to HTML layout template

Step 2: Create Contents for the Report

Based on the layout designed in Step 1, the report contents have three parts: 1) Summary Paragraph, 2) Active Fire Map, and 3) Fire Activity Statistics Chart. Each part has its own requirements, such as auto-filled values and geospatial visualizations.

Summary Paragraph

The summary paragraph requires auto-generated values such as the current date and the total number of wildfires. These values can be read directly from the data returned by the API. In the HTML page, we define these values as input variables using Jinja delimiters, as shown below.

<tr>
    <td style = "height: 107pt">
        <h1 style = "font-size:20px">summary</h1>
        <p style = "font-size:15px"> On <b>{{date}}</b>, a total of <b>{{number}}</b> wildfires were detected within the U.S. <br>
        <b>{{state}}</b> has the largest number of wildfires in the country, with <b>{{number_state}}</b>.</p>
    </td>
</tr>

Rendered summary paragraph in the PDF report
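To see how the `{{ ... }}` delimiters behave before wiring up the full template, a short string template can be rendered on its own. This is only a sketch: the sentence is shortened, and the date and count are placeholder values, not real figures.

```python
from jinja2 import Template

# A one-line template using the same variable names as the report layout
snippet = Template(
    "On <b>{{ date }}</b>, <b>{{ number }}</b> wildfires were detected."
)

# Placeholder values for illustration only
html = snippet.render(date="2023-05-09", number=120)
print(html)
# On <b>2023-05-09</b>, <b>120</b> wildfires were detected.
```

Jinja replaces each delimited name with the matching keyword argument, which is what happens to the full layout template in Step 3.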

Active Fire Map

The Active Fire Map visualizes the locations of active wildfire centers. First, the raw data returned by the API needs to be converted into a GeoDataFrame with a geometry attribute; geoplot is then used to visualize the data as points on the map. The plot is saved as a PNG and used as the image source in the HTML layout template.

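  1. Convert raw data into a GeoDataFrame

# Convert the raw fire records (fire_df, holding the API's CSV data) into a GeoDataFrame
import geopandas as gpd
from geopandas import GeoDataFrame
from shapely.geometry import Point

# Build one point per record from its longitude/latitude (WGS 84)
geometry = [Point(xy) for xy in zip(fire_df.longitude, fire_df.latitude)]
# The coordinate columns are now redundant with the geometry column
fire_df = fire_df.drop(['longitude', 'latitude'], axis=1)
fire_gdf = GeoDataFrame(fire_df, crs="EPSG:4326", geometry=geometry)
fire_gdf.head()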

  2. Visualize the points on the map

# Plot today's fire detections on a map of the contiguous U.S.
import geoplot as gplt
import geoplot.crs as gcrs
import matplotlib.pyplot as plt
import mapclassify as mc

contiguous_usa = gpd.read_file(gplt.datasets.get_path('contiguous_usa'))
# Classify brightness into five quantile bins for the color scale
scheme = mc.Quantiles(fire_gdf['bright_ti4'], k=5)
# Base layer: state polygons
ax = gplt.polyplot(
    contiguous_usa,
    zorder=-1,
    linewidth=1,
    projection=gcrs.AlbersEqualArea(),
    edgecolor='white',
    facecolor='lightgray',
    figsize=(18, 12)
)
# Overlay: fire points colored by brightness
gplt.pointplot(
    fire_gdf,
    hue='bright_ti4',
    scheme=scheme,
    cmap='Reds',
    ax=ax
)

# `date` is the report date, defined earlier in the script
plt.title(f"Fire Detection in The U.S. on {date}")
plt.savefig('fire_plot.png', bbox_inches='tight')

  3. HTML layout template

<tr valign = "top" style = "height: 300pt">
    <td width = "100%" style = "text-align: center">
        <img src="Business_Report/fire_plot.png"/>
    </td>
</tr>

The rendered wildfire map in the PDF report

Fire Activity Statistics Chart

Similar to the Active Fire Map, the fire activity statistics chart is first plotted with Seaborn and then saved as a PNG, which serves as the image source in the HTML layout template.

  1. Visualize the statistical chart

# Draw a bar chart of fire counts per state
import seaborn as sns

# Spatial join: attach each fire point to the state polygon containing it
sjoin_gdf = gpd.sjoin(contiguous_usa, fire_gdf)
df_grouped = sjoin_gdf.groupby('state')["index_right"].agg(['count'])
df_grouped['state'] = df_grouped.index
# Sort states by fire count, largest first
df_grouped = df_grouped.sort_values(['count'], ascending=False)
plt.figure(figsize=(12, 7))
ax = sns.barplot(
    y="count",
    x="state",
    data=df_grouped,
    edgecolor="none",
    errorbar=None,
    color='red')
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right")
sns.despine(left=True, bottom=True, right=True)
plt.tight_layout()
# Save the chart so the template's <img> tag can reference it
plt.savefig('fire_chart.png', bbox_inches='tight')

  2. HTML layout template

<tr valign = "top" style = "height: 300pt">
    <td width = "100%" style = "text-align: center">
        <img src="Business_Report/fire_chart.png"/>
    </td>
</tr>

The rendered chart in the PDF report

Step 3: Render the Final Report

In the last step, we combine the HTML layout template with Python functions to 1) fill the variables into the template and 2) render the final report as a PDF page. The rendering uses functions from the xhtml2pdf library.

  1. Jinja module to fill in variables

from jinja2 import Environment, FileSystemLoader

# Load the layout template from the project folder
env = Environment(loader=FileSystemLoader(r'Python_Lab\Business_Report'))
template = env.get_template("report_layout.html")

# Map each template variable to its value
template_vars = {
    "date": date,
    "number": total_number,
    "state": top_state,
    "number_state": top_number,
}
html_out = template.render(template_vars)

  2. Render the HTML to PDF

from xhtml2pdf import pisa

def convert_html_to_pdf(source_html, output_filename):
    # Open the output file for writing (truncated binary)
    with open(output_filename, "w+b") as result_file:
        # Convert HTML to PDF
        pisa_status = pisa.CreatePDF(source_html, dest=result_file)
    # Return the error status (a non-zero value signals a failed conversion)
    return pisa_status.err

convert_html_to_pdf(html_out, 'report.pdf')

Final report in PDF
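The snippets above pass `date`, `total_number`, `top_state`, and `top_number` to the template without showing where they come from. One possible way to derive them, assuming a per-state table shaped like `df_grouped` (columns `state` and `count`), is sketched below. The helper name `summary_values` and the demo table are hypothetical, and how the total is counted (for example `len(sjoin_gdf)`) is an assumption.

```python
import datetime

import pandas as pd


def summary_values(df_grouped, total_number, report_date=None):
    """Build the template variables from per-state fire counts."""
    # Take the state with the highest count
    top = df_grouped.sort_values("count", ascending=False).iloc[0]
    report_date = report_date or datetime.date.today()
    return {
        "date": report_date.isoformat(),
        "number": int(total_number),
        "state": top["state"],
        "number_state": int(top["count"]),
    }


# Placeholder counts for illustration only (not real data)
demo = pd.DataFrame({"state": ["A", "B"], "count": [3, 7]})
template_vars = summary_values(
    demo, total_number=10, report_date=datetime.date(2023, 5, 9)
)
print(template_vars)
```

The returned dictionary has the same keys as `template_vars` in Step 3, so it could be passed straight to `template.render`.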

Click here to get the full code of the article

Lin Zhu
Geek Culture

spatial science | work in risk analysis | programmer