Condensing information from a large numerical simulation of the ocean circulation with Pangeo: vertical mode decomposition

Noé Lahaye
Published in pangeo
Mar 23, 2022

Unravelling physical processes hidden in large datasets (e.g. from numerical simulations or observations) usually amounts to finding the best way to reduce the volume of data while maximizing the amount of relevant information. For ocean dynamics, and more specifically tidal motions, such a reduction entails a decomposition on dynamically relevant basis functions called “vertical modes”. We recently carried out this type of diagnostic on a ~500 TB dataset from the eNATL60 simulation, a NEMO-based 1/60° (~1.2 km) numerical run of the North Atlantic ocean circulation. This challenging endeavor was made possible through intensive use of xarray, dask, xgcm and zarr (some of this wrapped through the xorca package).

A tiny bit of theory: what is a “vertical mode decomposition”?

It is kind of similar to a spatial Fourier mode decomposition, but uses a set of basis functions which are relevant for ocean dynamics (more details can be found in any decent textbook on ocean/atmosphere dynamics) and depend on the (mean) vertical distribution of water density variations, the so-called “stratification”. The core principle is that one goes from a variable with a vertical dependence (which translates into 300 vertical levels in eNATL60) to a set of mode amplitudes; by truncating the number of modes retained (e.g. 10), one condenses the data while retaining the dynamically relevant information.

To illustrate this, let us look at a single vertical profile of the instantaneous horizontal velocity in the model. The decomposition goes as follows (illustrated here at a single horizontal point and time instant):

Vertical mode decomposition of a single instantaneous vertical profile (the “reconstructed” profile in the left panel uses 10 modes)

The original signal (black line, left panel) is decomposed into a sum of contributions (colored lines), each consisting of an amplitude (u_n, a number in this case) multiplied by the vertical mode 𝛟_n. The latter are prescribed, computed from the mean stratification. In this example, retaining only the first 10 modes, i.e. ten numbers, captures 80% of the vertically integrated squared signal (kinetic energy), which is originally defined on 300 vertical levels.
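In code, the projection of a single profile and the captured-energy estimate would look roughly like this (a minimal numpy sketch, assuming modes normalized so that the depth average of phi_n squared is one; all names are illustrative):

import numpy as np

def project_on_modes(u, phi, dz):
    # u: velocity profile (nz,); phi: vertical modes (nmodes, nz); dz: layer thicknesses (nz,)
    H = dz.sum()
    # modal amplitudes u_n = (1/H) * integral(u * phi_n dz),
    # valid if (1/H) * integral(phi_n**2 dz) = 1
    amps = (u * phi * dz).sum(axis=-1) / H
    # reconstruction from the retained modes
    u_rec = (amps[:, None] * phi).sum(axis=0)
    # fraction of the vertically integrated squared signal that is captured
    captured = (u_rec**2 * dz).sum() / (u**2 * dz).sum()
    return amps, u_rec, captured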

To the chase!

The corresponding workflow is as follows:

1. Compute the mean stratification

It consists of a sequence of time averages of the hourly outputs over the whole year-long dataset. One day (24 hourly outputs) of one 3D+time variable is initially contained in a ~280 GB netCDF file. This was (nearly) as simple as:
import xarray as xr

# lazily open one day of hourly outputs, average over time and store as zarr
ds = xr.open_dataset(file_in, chunks=chunks)
ds_processed = ds.mean("time_counter")
ds_processed.to_zarr(output_zarr, mode="w")

This was done for temperature and salinity, first by day (launching a bunch of dask clusters, two per node, each finishing in about 3 hours wall time), then by month, and finally over the whole time period. The stratification was then computed using standard “sea-water” routines relating temperature and salinity to water density (adapted from the CDFTOOLS), with xgcm.diff and xgcm.interp for the vertical operations. This took a few minutes to run on a single-node cluster.
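As a rough sketch of this last step, the stratification computation could look like the following (assuming a linear equation of state as a stand-in for the actual CDFTOOLS-adapted routine, and hypothetical variable and dimension names):

import xarray as xr
import xgcm

g, rho0, alpha, beta = 9.81, 1026.0, 2e-4, 7.6e-4   # linear EOS coefficients (assumed)

ds = xr.open_zarr("mean_TS.zarr")   # hypothetical store holding the time-mean temp/salt
grid = xgcm.Grid(ds, coords={"Z": {"center": "z_c", "outer": "z_f"}}, periodic=False)

# density anomaly from a linear equation of state (stand-in for the sea-water routine)
sigma = rho0 * (beta * ds["salt"] - alpha * ds["temp"])

# Brunt-Vaisala frequency squared, N^2 = -(g/rho0) d(sigma)/dz, at cell interfaces
N2 = -(g / rho0) * grid.diff(sigma, "Z", boundary="extend") \
    / grid.diff(ds["depth_c"], "Z", boundary="extend")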

2. Compute the vertical modes

Horizontal phase speed of the mode 1 semi-diurnal internal tide

An embarrassingly parallel problem: solve a z-dependent eigenproblem at every grid point in the horizontal, roughly 40 million of them. We designed a custom class (Vmodes) dedicated to the vertical mode computation and projection. It is based on xarray objects and wraps the core of the projection (solving the eigenproblem in numpy) thanks to xarray.apply_ufunc:
# compute the first 10 vertical modes on every water column and write them to zarr
vmods = Vmodes(ds, xgcm.Grid(ds, periodic=False), nmodes=10)
vmods.store(output_zarr, mode="w")

vmods contains a dataset with the vertical eigenfunctions (phi) and the associated eigenvalues (corresponding, e.g., to the horizontal phase speed of internal tide modes, see figure above), plus coordinates and metric terms.

For this computation, we deployed a dask cluster with 14 workers of 10 GB RAM each on two 28-core nodes, using dask-mpi. The task completed in roughly 11 hours, and was performed sequentially over segments in latitude (100 points each) to prevent the workers' memory from blowing up. We could probably get rid of this “undasky” serialization by deploying a cluster around 40 times as big, but did not take the time to investigate that path.
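Under the hood, the apply_ufunc wrapping mentioned above looks roughly like the following heavily simplified sketch (uniform grid spacing, rigid-lid boundary conditions on the vertical-velocity modes, a stratification variable N2 assumed to be in ds, and hypothetical names such as compute_modes_1d; the actual Vmodes class handles the NEMO grid metrics and boundary conditions properly):

import numpy as np
import scipy.linalg as la
import xarray as xr

def compute_modes_1d(N2, dz=10.0, nmodes=10):
    # solve d2(phi)/dz2 = -(N2/c2) phi with phi = 0 at top and bottom,
    # written as the generalized eigenproblem -D2 phi = (1/c2) diag(N2) phi
    nz = N2.size
    D2 = (np.diag(-2.0 * np.ones(nz)) + np.diag(np.ones(nz - 1), 1)
          + np.diag(np.ones(nz - 1), -1)) / dz**2
    lam, phi = la.eigh(-D2, np.diag(N2))       # requires N2 > 0
    order = np.argsort(lam)[:nmodes]
    return phi[:, order].T, 1.0 / np.sqrt(lam[order])   # modes and phase speeds

phi, c = xr.apply_ufunc(
    compute_modes_1d, ds["N2"],
    kwargs={"dz": 10.0, "nmodes": 10},
    input_core_dims=[["z_c"]],
    output_core_dims=[["mode", "z_c"], ["mode"]],
    vectorize=True, dask="parallelized",
    output_dtypes=[float, float],
    dask_gufunc_kwargs={"output_sizes": {"mode": 10}},
)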

3. Project the dynamical fields onto the vertical modes

This is a vertical integral, hence a reduction operation, performed independently at every time step and at each of the (8354 x 4729) horizontal grid points. For instance, projecting the pressure field (used to estimate the SLA in the previous figure) boiled down to:
dm = vmods.ds                     # dataset holding the modes (phi), metrics and masks
xgrid = xgcm.Grid(dm)
# density from temperature and salinity (sea-water routine adapted from CDFTOOLS)
dens = eos_sigmai(ds_data["temp"], ds_data["salt"], dm["depth_c_m"])
# hydrostatic pressure: cumulative vertical integral of density
pres = xgrid.cumsum(dens * dm["e3t_m"], "Z")
# projection: vertical integral of pressure times each mode
pmod = (pres * dm["phi"] * dm["e3t_m"]).where(dm["tmask"]).sum("z_c")
pmod.to_dataset(name="pres_mod").to_zarr(output_zarr, mode="w")

This operation virtually involves, at some point, an Nx*Ny*Nz*Nt*Nm DataArray. However, because it reads the input data quite intensively, we again followed the strategy of distributing the calculation sequentially (over time and y-segments) rather than deploying very large clusters. In the end, we needed a 2-node cluster for the pressure (which reads T and S as inputs) and a 1-node cluster for the horizontal velocity, both tasks completing in less than two hours for one day of projected data.
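As an illustration of this sequential strategy, the loop over latitude segments might look like the following (a sketch with hypothetical names and segment size; the actual scripts are in the repository):

ny, seg = dm.sizes["y"], 100   # number of latitude points and segment size (assumed)

for i, j0 in enumerate(range(0, ny, seg)):
    sl = {"y": slice(j0, j0 + seg)}
    # project one latitude band at a time so that the workers' memory stays bounded
    pmod_seg = (pres.isel(sl) * dm["phi"].isel(sl) * dm["e3t_m"].isel(sl)) \
        .where(dm["tmask"].isel(sl)).sum("z_c").to_dataset(name="pres_mod")
    if i == 0:
        pmod_seg.to_zarr(output_zarr, mode="w")          # create the store
    else:
        pmod_seg.to_zarr(output_zarr, append_dim="y")    # append the next band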

An illustration of the results

Let’s take a look at the surface signature of internal tides on the sea level anomaly (i.e. the pressure perturbation at z=0) — but bear in mind we are dealing with full 3D+time fields here. We are now able to decompose this (which is band-pass time-filtered around the tidal frequency):

Sea Level Anomaly associated with the baroclinic internal tide (m=0 mode removed; snapshot; in cm)

into this:

Sea level anomaly associated with the first few vertical modes (snapshot)

where the effect of the decomposition is obvious. One important aspect is that the “filtering” effect (horizontal scale selection) results from a vertical projection, and that each mode has a distinct vertical structure associated with a distinct horizontal length scale and, to some extent, distinct dynamics. That is why vertical mode decomposition can be a very powerful tool, in strong connection with the analysis of remote (satellite) measurements of the ocean surface.
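For reference, the per-mode surface signature shown above can be obtained from the modal amplitudes by evaluating each mode at the surface, along these lines (a sketch reusing the hypothetical names of the projection snippet, up to the normalization conventions of phi and of the pressure):

g, rho0 = 9.81, 1026.0                 # gravity and reference density (assumed)

phi_surf = dm["phi"].isel(z_c=0)       # surface value of each mode (index 0 assumed at z=0)

# sea level anomaly carried by each mode (assuming pmod is a pressure in Pa),
# and its reconstruction from the retained modes
sla_per_mode = pmod * phi_surf / (g * rho0)
sla_reconstructed = sla_per_mode.sum("mode")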

Problems and challenges

As relatively new practitioners of dask, we faced a substantial number of challenges in these calculations. Some of these challenges involved understanding the interplay between task ordering, memory usage, rechunking schemes and I/O, while mitigating the scheduling load. These are fairly standard issues in distributed dask calculations, but they were exacerbated here by the shape of the dataset. The most important challenge was related to large fluctuations in data-reading performance, whose origin we were able neither to diagnose (difficult on HPC systems without substantial support from sys admins) nor to mitigate appropriately. An important remark in this regard is that we were reading our data from netCDF files with rather odd chunking (t=1, z=1, y=10, x=Nx=8354). Adequate tools to diagnose such situations and mitigate them would be beneficial.
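For instance, one common mitigation (not something that was verified here) is to request dask chunks that are aligned with, and whole multiples of, the on-disk netCDF chunks, e.g. (dimension names assumed):

# on-disk chunking of the netCDF files: (t=1, z=1, y=10, x=8354)
# request dask chunks that are whole multiples of these, and full size along x
chunks = {"time_counter": 1, "deptht": 30, "y": 100, "x": -1}
ds = xr.open_dataset(file_in, chunks=chunks)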

Of course, lots of details were omitted here, involving some interpolations and other manipulations, all of them making great use of the xgcm/xarray tools. You can find more detailed information, the dedicated library, python scripts and notebooks in the GitHub repo.

What’s next?

At this stage we have 6 months of data projected, which is already a nice opportunity for investigating the dynamics and/or playing around with some reconstruction methods in the context of SWOT, to name a few possibilities. Our intention is obviously to (complete and) make this dataset available to the community, so feel free to contact us!

Credits

This work was done in collaboration with Aurélien Ponte, in close collaboration with Aurélie Albert & Julien Le Sommer, and was made possible by the work of the MEOM group at IGE and Ocean Next. Access to and analysis of the eNATL60 outputs were carried out on the Occigen supercomputer at CINES.

Noé Lahaye
Physical oceanographer and researcher at Inria (France). Interested in geophysical fluid dynamics, with a focus on ocean turbulence and internal waves.