Progress update on Data Analytics for Canadian Climate Services

David Huard
Published in birdhouse · Feb 26, 2021

Data Analytics for Canadian Climate Services (DACCS) is a project funded under the Canada Foundation for Innovation Cyberinfrastructure challenge and provincial scientific funds from Québec, Ontario and British Columbia. The project involves academics, government scientists and staff from regional climate service centers, as well as collaborators from Europe and the US. One of its aims is to build the software infrastructure necessary to offer Canadians scientifically robust climate products to support climate change adaptation decisions. The spirit of the project is not to create a large monolithic infrastructure, but rather to create a network of independent thematic servers offering climate data or analytical services. We believe this approach has the advantage of granting individual scientists and institutions full control over the data and services they share with the federation, while providing users a single access point. By enforcing standards on data formats, metadata conventions and machine communications, we hope to achieve interoperability with minimum coordination efforts.

The project has four work packages: WP1 focuses on user interfaces and interactions, WP2 on data and service nodes, WP3 on Earth Observation (EO) analytics, and WP4 on climate services analytics. This post is about progress made over the last few months in WP4.

Progress in WP4: Climate services

WP4 has the mandate to offer a suite of software tools to create custom climate products. Climate products are varied: they could be projections of the length and severity of heatwaves in a given city around 2050, or risks of flooding in 2100. Creating these products requires access to climate model projections and climate observations, then processing these data to compute the quantities of interest and assess uncertainties. Over the last months, Ouranos and the Pacific Climate Impacts Consortium (PCIC) have been busy working on various open-source software libraries to facilitate the creation of climate products. The following provides an overview of recent advances.

Climate indicators

The Python ecosystem has the nice MetPy library to compute indices on weather data. We wanted to do something similar for climate indices, analogous to ICCLIM but using xarray+dask to do the heavy lifting instead of custom Fortran. Development of the xclim library started in 2019, and it is now the home of many of our climate utilities, including over 50 different climate indicators often used in climate change impact assessments. In the course of the DACCS project, we recently added indicators for maximum precipitation intensity computed on hourly data, the Richards-Baker flow flashiness metric operating on river flow, cold spell frequency, frost season length, as well as sea ice area and sea ice extent. A lot of work also went into the fire weather indices developed by the Canadian Forest Service. A full list of available indicators can be found in the xclim documentation.
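
As an illustration, here is a minimal sketch of computing one of these indicators with xclim. The file name and variable are hypothetical; tx_days_above is among the indicators shipped with the library:

```python
import xarray as xr
import xclim

# Hypothetical CF-compliant NetCDF file holding daily maximum temperature.
ds = xr.open_dataset("tasmax_daily.nc")

# Annual number of days with a maximum temperature above 30 °C.
# xclim parses and validates units, and attaches CF metadata to the output.
hot_days = xclim.atmos.tx_days_above(tasmax=ds.tasmax, thresh="30.0 degC", freq="YS")
```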

Infrastructures such as buildings, roads, dams or drainage networks are designed according to regulations mandating levels of tolerable risk. For example, an urban drainage network could be designed to withstand a rain intensity over 30 minutes that is only exceeded once every 20 years on average. The computation of these exceedance levels relies on statistical techniques that are now also part of xclim, facilitating the assessment of climate change impacts on design criteria and levels of risk. A notebook example demoing these functionalities is now available.
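
As a minimal sketch of what such a computation looks like with xclim's statistical utilities (the input file and variable names are hypothetical):

```python
import xarray as xr
from xclim.indices import stats

# Hypothetical series of annual maximum 30-minute rain intensities.
da = xr.open_dataset("rain_intensity.nc").rx30min_annual_max

# Fit a Gumbel distribution to the annual maxima, then estimate the
# intensity exceeded on average once every 20 years.
params = stats.fit(da, dist="gumbel_r")
design_value = stats.parametric_quantile(params, q=1 - 1.0 / 20)
```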

Canadian cities and businesses can draw inspiration from southern regions on how to cope with future climate conditions. For example, a city like Montréal could look to Boston for experience in handling heatwaves or droughts. This mapping between the future climate at site A and the current climate at site B is called a spatial analog, and xclim now includes six different algorithms that return the distance between the distributions of climate indicators at different sites. See the API docs for details.
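
A rough sketch of how this might look, assuming datasets of precomputed indicators (the file names here are hypothetical, and the Zech-Aslan energy distance is one of the six available metrics):

```python
import xarray as xr
from xclim import analog

# Hypothetical datasets of climate indicators: the projected future climate
# at the target site (dims: time) and present-day observations over a set
# of candidate sites (dims: time plus spatial dimensions).
target = xr.open_dataset("site_a_future.nc")
candidates = xr.open_dataset("candidate_sites_present.nc")

# Dissimilarity between the target distribution and each candidate site,
# here using the Zech-Aslan energy distance.
score = analog.spatial_analogs(target, candidates, dist_dim="time", method="zech_aslan")
```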

Climate projections are notoriously uncertain, and a lot of attention is devoted to characterizing that uncertainty. One way to assess the robustness of results is to compare the outcomes of multiple independent simulations from different models; a variety of methods can then be used to measure agreement on the climate change signal. xclim now offers two algorithms used in the IPCC AR5 report to calculate the robustness of climate change signals from model ensembles. See the API docs for details.
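
These robustness metrics live in the xclim.ensembles module, alongside general utilities to assemble and summarize ensembles. A minimal sketch of the ensemble handling (file names are hypothetical):

```python
from xclim import ensembles

# Hypothetical NetCDF files, one per climate model simulation.
files = ["model_a.nc", "model_b.nc", "model_c.nc"]

# Stack the simulations along a new "realization" dimension, then summarize
# the ensemble spread with percentiles.
ens = ensembles.create_ensemble(files)
summary = ensembles.ensemble_percentiles(ens, values=[10, 50, 90])
```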

Regridding

To compare model outputs to observations or to outputs from other models, we usually need to interpolate the data onto a common grid. Multiple libraries offer such regridding capabilities, but we wanted one targeted at geospatial analysis (e.g. able to handle poles and land-sea masks), as fast as possible, and compatible with the xarray ecosystem. We selected xESMF, which is based on the ESMPy wrapper of the ESMF library. Over the last months, we contributed to the maintenance of xESMF, fixing bugs, improving documentation and implementing new features. For example, weights can now be passed from ESMF to xESMF in memory instead of through disk I/O, coordinate variables are auto-detected using cf-xarray, grid corners for conservative regridding can be computed automatically using cf-xarray, and we improved the handling of masked values. We also ported mesh objects to xESMF in order to support area-weighted averaging over polygons, including multi-geometries (see example).
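
In practice, regridding with xESMF boils down to building a weight matrix once and then applying it to any number of fields. A minimal sketch (dataset names are hypothetical):

```python
import xarray as xr
import xesmf as xe

# Hypothetical model output and target grid, both with lat/lon coordinates.
ds_model = xr.open_dataset("model_output.nc")
ds_obs = xr.open_dataset("observations.nc")

# Compute the interpolation weights once, then apply them to the data.
regridder = xe.Regridder(ds_model, ds_obs, method="bilinear")
ds_on_obs_grid = regridder(ds_model)
```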

Statistical downscaling and bias adjustment

Climate model outputs do not exactly match ground observations, in part due to differences in spatial representation (models describe average values over grid cells covering hundreds of square kilometers) and to model approximations. It's often necessary to correct those biases before model projections are used as inputs to specialized impact models (fire propagation, flooding, agriculture, etc.). Bias correction algorithms are meant to remove systematic biases from climate model simulations. These algorithms are, however, often computationally intensive, and at Ouranos this has prevented their application to very large datasets. We've recently implemented a handful of bias correction algorithms using xarray+dask: Detrended Quantile Mapping, Empirical Quantile Mapping, Local Intensity Scaling, Quantile Delta Mapping and a simple mean scaling method (see docs). Compared to our legacy code, these new implementations speed up computations by roughly two orders of magnitude.
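
As a sketch of how these adjustments are applied with xclim's sdba module at the time of writing (the variable names are hypothetical): an adjustment object is trained on a reference and a historical simulation, then applied to the simulation to be corrected.

```python
import xarray as xr
from xclim import sdba

# Hypothetical daily precipitation series: observed reference, the model
# simulation over the same historical period, and the future simulation.
ref = xr.open_dataset("obs.nc").pr
hist = xr.open_dataset("model_hist.nc").pr
sim = xr.open_dataset("model_future.nc").pr

# Train an empirical quantile mapping, grouped by month and multiplicative
# (appropriate for precipitation), then adjust the future simulation.
eqm = sdba.EmpiricalQuantileMapping(nquantiles=50, group="time.month", kind="*")
eqm.train(ref, hist)
scen = eqm.adjust(sim)
```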

Infrastructure as code

Finally, a lot of effort went into building an infrastructure that is well documented and can be easily deployed by others. Each component of our architecture is a publicly available Docker container, and the configuration of these containers into a unified platform is done by code hosted in a public repository (birdhouse-deploy). Individual WPS servers are all based on the same template (cookiecutter-birdhouse), which facilitates the maintenance of multiple thematic servers. If you want to try out some of the functionalities offered, use the demo account on pavics.ouranos.ca, a JupyterLab instance whose Python environment includes the latest advances made by the team.

Next steps

Over the next few months, WP4's attention will focus on new indicators related to snow and cold-season processes, data cataloging and data access, as well as improved documentation and examples. We're also planning to add support for additional bias correction algorithms, including a multivariate method and another that preserves the occurrence of extreme values.
