SeismicPro: bringing AI solutions to Seismic Processing

Anton Broilovskiy
Data Analysis Center
7 min read · Aug 17, 2021


Technological breakthroughs and discoveries in machine and deep learning make it possible to optimize and speed up workflows in industries with complex production processes. One such domain is Oil & Gas exploration, which consists of numerous complicated yet routine operations aimed at locating oil reservoirs.

However, conventional deposits are depleting, and existing tools and workflows are not ready for the severe challenges brought by unconventional reserves. This is why Oil & Gas is one of the best fields to exploit the latest advances in deep learning to achieve revolutionary improvements for the whole industry.

The first step to oil production

We’ve already told you a lot about seismic interpretation and now it’s time to cover the stage that precedes it — seismic processing. However, before diving into the details of how and why we are optimizing it using deep learning, let’s talk a little about seismic processing and what data it operates with.

Raw seismic data is collected from thousands of receivers that cover hundreds of square kilometers. Each of them records earth fluctuations caused by vibration sources located on the surface. The goal of seismic processing is to reconstruct the subterranean structure by accurately transforming and combining the recorded waves into a post-stack seismic cube.

Just to clarify: seismic waves, signals, and unstructured or pre-stack seismic data are all synonyms of the data used during the seismic processing stage :)

Flaws of the current approach

The labor of geophysicists gets more complicated every year: survey sizes are increasing, and simple oil-saturated reservoirs are being depleted and replaced with thinner ones located deeper. Since almost all the procedures are performed and validated manually, the processing of new surveys slows down significantly and no longer meets industry demands.

For example, one of the first procedures, acquisition geometry correction, is about detecting receivers or sources that were misplaced during seismic surveying. To find them, experts visually evaluate each seismogram and have to do it thoroughly, since even a single misplacement negatively affects all further steps. It can thus take up to a day to check tens of thousands of gathers, which results in weeks of labor-intensive work for large surveys.
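As a toy illustration of how such misplacements could be caught automatically (this is the underlying idea, not SeismicPro's actual algorithm): first arrivals grow roughly linearly with offset, so a trace whose arrival time disagrees with its nominal offset stands out as a residual outlier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Nominal source-receiver offsets (m) and first-arrival times (ms):
# t ~ offset / v_near_surface, plus small recording noise.
offsets = rng.uniform(100, 2000, size=200)
times = offsets / 1.8 + rng.normal(0, 5, size=200)  # v ~ 1.8 m/ms

# Simulate one receiver whose true position is 500 m away from the
# position written in the headers: its arrival time no longer
# matches its nominal offset.
times[17] = (offsets[17] + 500) / 1.8

# Fit t = a * offset + b and flag traces with abnormal residuals.
a, b = np.polyfit(offsets, times, deg=1)
residuals = np.abs(times - (a * offsets + b))
suspects = np.flatnonzero(residuals > 3 * residuals.std())

print(suspects)
```

A real survey would need a more careful near-surface velocity model, but the principle stays the same: geometry errors show up as traces inconsistent with their neighbors.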

Another example is ground roll attenuation. The objective is to reduce the noise generated by near-surface waves that mask the seismic signal. To do so, an expert adjusts filter parameters using a small subset of gathers and then applies them to denoise the whole survey. However, since subsurface conditions may vary greatly across the field, such extrapolation is inadequate at times and distorts the overall seismic profile, causing certain horizons in a post-stack cube to blur or even disappear!
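For intuition on why ground roll can be separated at all: it is much lower-frequency than the reflected signal, so even a crude spectral filter removes it on a synthetic trace. A minimal numpy sketch (the frequencies and the 20 Hz cutoff are illustrative; production filters use tapered band edges and, as noted above, must adapt to local subsurface conditions):

```python
import numpy as np

dt = 0.002                      # 2 ms sampling interval
t = np.arange(0, 2, dt)         # a 2 s trace

# A ~40 Hz reflection wavelet buried under strong ~8 Hz ground roll.
signal = np.sin(2 * np.pi * 40 * t) * np.exp(-((t - 1.0) / 0.1) ** 2)
ground_roll = 5 * np.sin(2 * np.pi * 8 * t)
trace = signal + ground_roll

# Zero out everything below 20 Hz in the frequency domain
# (a crude high-pass filter).
spectrum = np.fft.rfft(trace)
freqs = np.fft.rfftfreq(trace.size, d=dt)
spectrum[freqs < 20] = 0
filtered = np.fft.irfft(spectrum, n=trace.size)

# The filtered trace is close to the clean reflection signal.
print(np.abs(filtered - signal).max())
```

The hard part in practice is exactly what the paragraph above describes: a cutoff tuned on one part of the field may eat useful signal elsewhere.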

The rest of the operations in the processing pipeline perform elaborate data transformations ranging from various types of noise attenuation to seismic wave velocity analysis.

But all of them share common flaws: they are iteratively fine-tuned to match geological conditions and rely heavily on visual quality control, thus requiring a lot of experts' time and effort.

Alas, despite all the effort that experts put into their work, certain stages sometimes have to be redone due to mistakes or insufficient quality of the result. Otherwise, wrong estimates of oil deposits and their locations lead to huge financial losses. Deep learning is the key not only to speeding up seismic processing but also to obtaining predictions comparable to those of an ensemble of experts, instead of relying on a single specialist's opinion. This is why we developed SeismicPro.

Revising processing with cutting-edge tools

SeismicPro is an open-source library that helps geophysicists simplify and accelerate seismic processing using deep learning.

The framework efficiently loads pre-stack seismic data in SEG-Y format, transforms it in parallel, and combines processing functions into readable pipelines, letting you apply a wide range of neural networks to geology-related tasks in just a few lines of code. Moreover, SeismicPro provides convenient tools to assess model quality, such as a metric for seismic denoising models and quality maps for first-break picking.

Before we move forward into examples, let’s discuss the framework’s structure:

The central class of our library is called Survey, and it represents a single SEG-Y file. This class does not store trace data, only a requested subset of trace headers and general file metadata. To access seismic traces, a Survey can generate an instance of the Gather class, which describes a collection of seismic traces that share a common acquisition parameter and provides a number of processing actions.

General information about the SEG-Y format and its structure, including headers and binary layout, is briefly described here; you can also read a detailed description in the SEG-Y standard documentation.

The SeismicPro architecture is designed to process batches of gathers for convenient training of neural networks:

  • SeismicIndex enumerates gathers in a survey or a group of surveys and allows iterating over them;
  • SeismicBatch contains methods for joint and simultaneous processing of small subsets of seismic gathers;
  • SeismicDataset generates batches of the SeismicBatch class according to the index provided;
  • Pipeline combines all these classes and implements an interface to process gathers in parallel and train neural networks.

In a few minutes, you'll see an example of their usage in practice!

Basics of SeismicPro

Finally, it is time to dive into the code! In the beginning, we will take a glimpse at the core methods of our library, and then construct a pipeline that trains a model for the real-world task of finding the times of first breaks.

First of all, we need to describe the field that we are going to work with using the Survey class:
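A hypothetical call is sketched in the comment below (treat the exact signature as an assumption), and the idea behind its two header arguments can be mimicked with plain pandas, which is also how the loaded headers are stored:

```python
import pandas as pd

# Hypothetical SeismicPro call (argument names follow the text; the
# path, header names, and exact signature are assumptions):
#   survey = Survey("field.sgy", header_index="FieldRecord",
#                   header_cols=["TraceNumber", "offset"])

# The pandas analogy: trace headers form a table where header_index
# groups traces into gathers, and header_cols are extra per-trace
# attributes kept for later processing.
headers = pd.DataFrame({
    "FieldRecord": [1, 1, 1, 2, 2],     # header_index: one gather per source
    "TraceNumber": [1, 2, 3, 1, 2],     # header_cols
    "offset":      [100, 220, 340, 150, 270],
}).set_index("FieldRecord")

# Selecting all traces of one gather is then a plain index lookup.
gather_headers = headers.loc[1]
print(len(gather_headers))
```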

Here, header_index and header_cols specify which trace headers to load:

  • header_index describes the headers used to group traces into gathers,
  • header_cols defines all other headers that might be useful during further processing.

You can find all available trace headers in the segyio documentation. It is possible to load all of them, but this is not recommended for performance reasons.

The loaded trace headers are stored in the headers attribute as a pandas DataFrame.

Calling the survey.sample_gather method generates a random gather; let's take a look at one sorted by offset:

A randomly selected common source gather sorted by `offset`
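Under the hood, sorting a gather by offset simply reorders the trace columns by the offset header. A minimal numpy sketch of that idea (array shapes and values are illustrative, not the library's internals):

```python
import numpy as np

rng = np.random.default_rng(42)

offsets = np.array([730, 150, 420, 980, 260])   # one value per trace
traces = rng.normal(size=(1000, offsets.size))  # samples x traces

# Reorder trace columns (and their headers) by ascending offset.
order = np.argsort(offsets)
sorted_traces = traces[:, order]
sorted_offsets = offsets[order]

print(sorted_offsets)
```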

AI-based solution to real-world seismic task

After this brief introduction to SeismicPro, you know everything you need to start tackling the first-break picking problem.

In a nutshell, the objective is to find the moment when the signal emitted by a source first arrives at a receiver. It is one of the initial steps of seismic processing that affects many further procedures and thus should be carried out accurately. You can find all the theoretical background on the task here; now let's dive into the code.
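To make the task concrete, here is a toy energy-ratio picker on a synthetic trace. It is not SeismicPro's neural-network approach, only an illustration of what "finding the first break" means: locate the sample where trace energy first jumps above the ambient noise level.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 0.002                               # 2 ms sampling
n = 500
trace = rng.normal(0, 0.05, n)           # ambient noise before the arrival
arrival = 200                            # true first-break sample
t_axis = np.arange(n) * dt

# Add a decaying ~30 Hz wavelet starting at the first break.
tail = t_axis[: n - arrival]
trace[arrival:] += np.sin(2 * np.pi * 30 * tail) * np.exp(-tail / 0.2)

# Pick the first sample where short-window energy jumps well above
# the noise level estimated from the trace's quiet beginning.
window = 20
energy = np.convolve(trace ** 2, np.ones(window) / window, mode="same")
noise_level = energy[:100].mean()
pick = int(np.argmax(energy > 10 * noise_level))

print(pick)
```

Classical pickers like this one degrade quickly on noisy field data, which is exactly why a trained model over many gathers is attractive here.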

First of all, we are going to create instances of Survey and SeismicDataset:

The attentive reader may recall that a SeismicDataset is created from a SeismicIndex. However, if you work with a single Survey, it can be passed directly to the SeismicDataset constructor, and a SeismicIndex will be created implicitly.

Here is an example of a Pipeline for the first-break picking task. It chains everything from processing procedures to model training and executes them iteratively on small pieces of data called batches.

Such pipelines are easy to follow even for a person without programming experience: first, we define a model, then load the data, process and reshape it, and finally, run a single step of model training.
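That define-load-process-train loop can be sketched without the library. Below, a toy one-parameter linear model learns first-break times from offsets by mini-batch gradient descent; every name, number, and modeling choice here is illustrative, not SeismicPro's API or its actual network:

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy "survey": first breaks grow linearly with offset, t = offset / v.
offsets = rng.uniform(100, 2000, size=1024)
first_breaks = offsets / 1.8 + rng.normal(0, 5, size=1024)  # v ~ 1.8 m/ms

# 1. Define a model (a single weight and bias) ...
w, b = 0.0, 0.0
lr = 0.1

# 2-4. ... then repeatedly load a batch, rescale it, and train one step.
for step in range(500):
    idx = rng.integers(0, offsets.size, size=64)  # load a random batch
    x = offsets[idx] / 2000.0                     # simple input scaling
    y = first_breaks[idx]
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)          # MSE gradients
    grad_b = 2 * np.mean(pred - y)
    w -= lr * grad_w
    b -= lr * grad_b

# Recover the implied near-surface velocity estimate (m/ms).
print(2000 / w)
```

A real pipeline swaps the linear model for a neural network and the synthetic arrays for batches of gathers, but the loop structure is the same.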

You can learn more about the ways to construct pipelines and define neural networks in this Medium article and on the GitHub page.

The quality control map

After the model has been trained, it is crucial to properly assess its quality. Usually, to ensure the correctness of the predicted first arrivals, experts repeatedly scan individual seismograms to evaluate the results. This approach requires a lot of a specialist's time and attention, yet it does not provide a complete picture of quality across the entire field.

We have developed a new approach to quality control by constructing a visual map of the field where red areas indicate either the presence of incorrect picks or the zones with complex upper layer conditions. This map significantly reduces the workload of human experts as they only have to look through a small number of highlighted areas.
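The idea behind such a map can be sketched in a few lines: aggregate a per-gather error metric onto a coarse spatial grid and flag cells whose mean error exceeds a threshold. All data and thresholds below are synthetic, chosen only to illustrate the aggregation:

```python
import numpy as np

rng = np.random.default_rng(3)

# Source coordinates across a 10 x 10 km field and pick errors (ms).
x = rng.uniform(0, 10_000, size=2000)
y = rng.uniform(0, 10_000, size=2000)
errors = np.abs(rng.normal(0, 2, size=2000))

# Degrade picks in one corner (e.g., a complex near-surface zone).
bad = (x < 2000) & (y < 2000)
errors[bad] += 15

# Aggregate mean error on a 1 km grid; each cell is one map pixel.
bins = 10
cell_x = np.minimum((x / 1000).astype(int), bins - 1)
cell_y = np.minimum((y / 1000).astype(int), bins - 1)
grid = np.zeros((bins, bins))
counts = np.zeros((bins, bins))
np.add.at(grid, (cell_y, cell_x), errors)
np.add.at(counts, (cell_y, cell_x), 1)
mean_error = grid / np.maximum(counts, 1)

# Cells above the threshold are the ones drawn in red for review.
flagged = np.argwhere(mean_error > 10)
print(flagged)
```

The expert then inspects only the flagged cells instead of scanning every seismogram.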

Here we have glimpsed one of the tasks we are currently working on. Many others are out there to be solved, and SeismicPro provides a wide range of tools to tackle them.

Summary

The Oil & Gas industry is a promising field that can gain enormous benefits from DL-based solutions. Modern neural networks incorporate the experience of multiple geophysicists and work swiftly and tirelessly, thus reducing companies' expenses and freeing experts to deal with other, more sophisticated problems.

Responding to industry demand, we have developed the open-source SeismicPro library, which provides a convenient API to work with pre-stack data in SEG-Y format, transform it in a massively parallel way, and easily develop neural networks for any processing procedure, eventually producing a post-stack seismic cube.

Don’t even dare think this is over; other interesting articles are on the way. Stay tuned!
