Benchmarking BRGM’s EFISPEC3D earthquake simulation application on AWS

Introduction

Published in Intel Tech, March 7, 2022

The Bureau de Recherches Géologiques et Minières (BRGM, the French Geological Survey) is France’s leading public institution for Earth science applications, covering the management of surface and subsurface resources with a view to sustainable development.

This post reports on the performance of EFISPEC3D[1], BRGM’s HPC application for computing earthquake scenarios, on AWS instances powered by Intel® Xeon® Scalable processors, using the Intel® oneAPI toolkit to accelerate computing.

EFISPEC3D is a scientific computer program that solves the three-dimensional equations of motion using a continuous Galerkin spectral finite element method. The code is parallelized using the Message Passing Interface (MPI).
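For reference, the three-dimensional equations of motion solved by spectral-element codes of this kind can be written, in their standard elastodynamic form (the exact constitutive model and source terms used by EFISPEC3D are not detailed in this post), as

ρ ∂²u/∂t² = ∇·σ(u) + f,   with σ(u) = C : ε(u) and ε(u) = ½ (∇u + ∇uᵀ),

where u is the displacement field, ρ the density, σ the stress tensor, C the elastic tensor, ε the strain tensor, and f the seismic source term. The continuous Galerkin spectral finite-element method discretizes the weak form of these equations on hexahedral elements.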

Performance results

The interest here is to present the use of AWS services to carry out BRGM’s prediction simulations. A typical simulation that would take about 300 days on a single core runs in about 25 minutes on the AWS Cloud, using 27k physical cores and no hyperthreading.

The 27k cores used for this performance report are based on c5n.18xlarge instances with Intel® Xeon® Platinum 8000 series processors (Skylake-SP), as well as m5zn.12xlarge instances powered by custom 2nd Generation Intel® Xeon® Scalable processors (Cascade Lake) running at 4.5 GHz. Both the Amazon c5n and m5zn instances use Elastic Fabric Adapter (EFA) for inter-node communication.

EFISPEC3D is implemented in Fortran 2008 and uses Intel MPI to solve the 3D wave equations with the spectral finite-element method.

Strong scaling simulations

In the first experiments, we test the strong scaling of the EFISPEC3D application on AWS c5n.18xlarge and m5zn.12xlarge instances.

Each simulation is run for 101 time steps of size 1.0E-03. Table 1 shows some information about the different simulations made on Amazon c5n instances:

Table 1: Information about simulations done on c5n.18xlarge instances
Figure 1: Strong scaling on AWS c5n.18xlarge instances
Figure 2: Strong scaling on AWS m5zn.12xlarge instances

The results present the elapsed times for the time-loop computation (in seconds) of EFISPEC3D using blocking or non-blocking communications, and the speedups achieved with respect to the simulation made on a single physical core. The strong-scaling efficiency on the AWS cluster with EFA is 67% at 27,036 cores, most notably for the non-blocking communications version, which reaches a speedup of 18,030 on 27,036 physical cores (18,030 / 27,036 ≈ 0.67; see Figure 1). Such a speedup allows a standard earthquake simulation to be computed in about 25 minutes instead of 300 days on a single core (300 days is about 432,000 minutes, and 432,000 / 18,030 ≈ 24 minutes).

When blocking communications are used, we can see a drop in performance around 2,048 cores due to the incompressible time spent in send/receive MPI calls. The sketch below illustrates the difference between the two communication schemes.
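As an illustration only (a hedged sketch, not EFISPEC3D’s actual source code), the following Fortran program contrasts a blocking halo exchange with a non-blocking one that overlaps communication with local computation; the array names, sizes, and ring-neighbour pattern are hypothetical:

program halo_exchange_sketch
   ! Hypothetical sketch contrasting blocking and non-blocking halo
   ! exchanges between MPI ranks; not EFISPEC3D source code.
   use mpi
   implicit none
   integer, parameter :: n = 1024, tag = 0
   real(8) :: send_halo(n), recv_halo(n)
   integer :: ierr, rank, nprocs, next, prev, req(2)

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
   next = mod(rank + 1, nprocs)            ! illustrative ring partners
   prev = mod(rank - 1 + nprocs, nprocs)
   send_halo = real(rank, 8)

   ! Blocking exchange: the rank waits until the messages complete before
   ! doing any computation, so communication time adds directly to the
   ! time-loop cost and cannot be hidden.
   call MPI_Sendrecv(send_halo, n, MPI_REAL8, next, tag, &
                     recv_halo, n, MPI_REAL8, prev, tag, &
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
   ! ... compute all spectral elements here ...

   ! Non-blocking exchange: start the messages, compute the interior
   ! elements (which do not need halo data) while they are in flight,
   ! then wait only before updating the boundary elements.
   call MPI_Irecv(recv_halo, n, MPI_REAL8, prev, tag, MPI_COMM_WORLD, req(1), ierr)
   call MPI_Isend(send_halo, n, MPI_REAL8, next, tag, MPI_COMM_WORLD, req(2), ierr)
   ! ... compute interior elements here, overlapped with communication ...
   call MPI_Waitall(2, req, MPI_STATUSES_IGNORE, ierr)
   ! ... compute boundary elements that depend on recv_halo ...

   call MPI_Finalize(ierr)
end program halo_exchange_sketch

With the non-blocking calls, the interior-element work hides most of the message latency, which is consistent with the non-blocking version continuing to scale where the blocking version flattens out around 2,048 cores.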

7M hexahedra simulations

During this second experiment, we enabled reading and writing to the Amazon FSx for Lustre file system in the EFISPEC3D application, since the application writes its data via Intel MPI. We then chose to check the performance of Amazon FSx for Lustre.

The objective was to test the performance of writing results to the file system when 2,048 tasks write their results at the same time. For this test, we used a cluster of 57 c5n.18xlarge instances.

Figure 3: Amazon FSx for Lustre performance when writing a file with MPI.

Impact of the Lustre file system parameters on the bandwidth measured for the reading phase of the data (206 GB). The stripe size varies from 1 MB to 10 MB and the striping factor varies from 5 to 80.

In the chart in Figure 3, each bar represents one execution with a different configuration of the Amazon FSx for Lustre parameters. For example, "5S - 1M" means the test was calibrated with a striping factor of 5 and a data block (stripe) size of 1 megabyte.

We notice that the best performance is reached with EFISPEC3D when the file system is set with a striping factor of 80 and a block size of 10 megabytes. A sketch of how such striping hints can be passed through MPI-IO is shown below.
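As an illustration (a hedged sketch, not EFISPEC3D’s actual I/O code), the following Fortran program shows one common way to pass these Lustre striping hints to a parallel MPI-IO write in which every rank writes its own contiguous block of a shared result file. The file path /fsx/results.bin, the per-rank buffer size, and the hint names assume a ROMIO-based MPI implementation such as Intel MPI:

program mpiio_write_sketch
   ! Hypothetical collective MPI-IO write with Lustre striping hints;
   ! not EFISPEC3D source code. File name and buffer size are assumptions.
   use mpi
   implicit none
   integer, parameter :: n = 1310720       ! 1,310,720 real(8) values = 10 MB per rank
   real(8) :: results(n)
   integer :: ierr, rank, fh, info
   integer(kind=MPI_OFFSET_KIND) :: offset

   call MPI_Init(ierr)
   call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
   results = real(rank, 8)

   ! ROMIO-style Lustre hints: striping factor 80, stripe size 10 MB.
   call MPI_Info_create(info, ierr)
   call MPI_Info_set(info, "striping_factor", "80", ierr)
   call MPI_Info_set(info, "striping_unit", "10485760", ierr)

   ! All ranks open the same file and each writes its own contiguous block.
   call MPI_File_open(MPI_COMM_WORLD, "/fsx/results.bin", &
                      MPI_MODE_CREATE + MPI_MODE_WRONLY, info, fh, ierr)
   offset = int(rank, MPI_OFFSET_KIND) * int(n, MPI_OFFSET_KIND) * 8_MPI_OFFSET_KIND
   call MPI_File_write_at_all(fh, offset, results, n, MPI_REAL8, &
                              MPI_STATUS_IGNORE, ierr)
   call MPI_File_close(fh, ierr)

   call MPI_Info_free(info, ierr)
   call MPI_Finalize(ierr)
end program mpiio_write_sketch

Note that such striping hints are only honored when the file is first created; pre-striping the output directory on the Lustre side (for example with lfs setstripe) is a common alternative way to obtain the same layout.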

Figure 4: General bandwidth in gigabytes per second for writing files during the post-processing phase.

During the post-processing phase of the EFISPEC3D application, the bandwidth went up to 10.3 GB/s when writing a 206 GB result file.

Conclusion

During the R&D and benchmarking phase, we were able to see how easy it is to deploy, install, and configure our HPC environment using AWS ParallelCluster. We also saw how quickly C5n and M5zn instances can be made available in AWS Regions across the globe. The M5zn instances are quite different in a number of respects: they have fewer, faster cores and are based on the Intel Cascade Lake architecture.

During the strong-scaling tests, we observed very good performance of the EFISPEC3D application, which demonstrates the power of the C5n and M5zn servers for the HPC world. A typical simulation that would take about 300 days on a single core runs in about 25 minutes on the AWS Cloud, using 27k physical cores and no hyperthreading.

Concerning the Amazon FSx for Lustre file system, we reached a bandwidth higher than 10 GB/s, which more than met the requirements of our application and its intensive disk writing by all compute nodes.

[1] EFISPEC3D is featured in an HPC workshop. The workshop guides the user through the process of deploying an HPC cluster and running the EFISPEC3D software. It can be found at this location:

Authors and Contributors:

  • Florent De Martin (BRGM) — Seismologist
  • Faïza Boulahya (BRGM) — Data Scientist
  • Steve Messenger (AWS) — Senior HPC Specialist Solution Architect
  • Gilles Tourpe (AWS) — Business Development Manager HPC
  • Diego Bailon Humpert (Intel) — Sales Development Manager AWS EMEA
  • Loubna Wortley (Intel) — Sales Development Manager AWS EMEA
  • Lilia Ziane Khodja (ANEO) — Consultant Expert HPC
  • Damien Dubuc (ANEO) — Solution Architect Expert HPC

Notices & Disclaimers:

  • Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex​​.
  • Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available ​updates. See backup for configuration details. No product or component can be absolutely secure.
  • Your costs and results may vary.
  • Intel does not control or audit third party data. You should consult other sources for accuracy.
  • Intel technologies may require enabled hardware, software or service activation.
  • © Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others. ​
