Seismic Fault Detection

Alexey Kozhevin
Data Analysis Center

--

In previous articles, we covered several tasks of seismic interpretation: horizon detection and facies segmentation. Another important step of interpretation is fault detection. We will describe the task in detail, discuss existing approaches to solving it, and briefly highlight our solution.

What is a fault and why are faults important?

Faults are vertical displacements of rock caused by rock-mass movements. Experts use horizons (layer boundaries) and faults to divide the seismic cube into regions with similar properties for subsequent reservoir analysis. Such a partition is called a structural model. Here is an example of a field model built from horizons and faults:

Horizons and faults

Structural modelling is necessary to understand oil migration routes and find potential deposits: this step is essential for the correct placement of production wells. Faults can disrupt the structure of the layers and destroy existing reservoirs or create new ones.

The fault destroyed the reservoir and oil escaped up through the layers
The fault created an obstacle to the migration of oil, trapping it in one place

How do experts detect faults now?

  • On every tenth seismic slice, the expert labels the fault with nodes connected by a polyline. Such polylines are called fault sticks.
Fault labeling by a stick
  • Then the fault sticks are interpolated to define a surface in 3D (a minimal interpolation sketch is given after this list).
Fault surface constructed by fault sticks
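To make the stick-to-surface step concrete, below is a minimal sketch of such an interpolation. It assumes each stick is stored as an array of (inline, crossline, depth) nodes and that the surface can be written as crossline = f(inline, depth); this is an illustration rather than the exact algorithm used in interpretation software.

```python
# A minimal sketch of turning fault sticks into a dense surface.
# Assumption: each stick is an array of (inline, crossline, depth) nodes
# picked on a single inline slice; names and shapes are illustrative.
import numpy as np
from scipy.interpolate import griddata

def sticks_to_surface(sticks, depth_range):
    """Interpolate sparse stick nodes into a crossline = f(inline, depth) surface."""
    nodes = np.concatenate(sticks, axis=0)              # (N, 3): inline, crossline, depth
    inlines = np.arange(nodes[:, 0].min(), nodes[:, 0].max() + 1)
    depths = np.arange(*depth_range)
    grid_i, grid_d = np.meshgrid(inlines, depths, indexing='ij')

    # Linear interpolation between sticks gives the usual "stretched" surface.
    crosslines = griddata(points=nodes[:, [0, 2]], values=nodes[:, 1],
                          xi=(grid_i, grid_d), method='linear')

    mask = ~np.isnan(crosslines)                        # drop points outside the convex hull
    return np.stack([grid_i[mask], crosslines[mask], grid_d[mask]], axis=1)
```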

To make things easier, experts use geological attributes (semblance, coherence, dip, azimuth, etc.) as an auxiliary tool to highlight suspicious regions. For example, dark regions of the Marfurt semblance attribute correspond to potential faults.
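To give a rough idea of how such attributes work, here is a simplified, Marfurt-style semblance computed with plain NumPy/SciPy; it illustrates the principle rather than the exact formulation used in commercial software. The cube is assumed to be a NumPy array with axes (inline, crossline, depth).

```python
# A rough sketch of a semblance-like attribute: the ratio of the energy of the
# locally stacked trace to the total energy in a small window. Values close to 1
# mean coherent reflectors; low ("dark") values mark discontinuities such as faults.
import numpy as np
from scipy.ndimage import uniform_filter

def semblance(cube, spatial_window=3, depth_window=5):
    # Local averages of amplitudes and squared amplitudes over a spatial window.
    mean_amp = uniform_filter(cube, size=(spatial_window, spatial_window, 1))
    mean_sq = uniform_filter(cube ** 2, size=(spatial_window, spatial_window, 1))

    # Accumulate both over a short depth window and take the ratio.
    num = uniform_filter(mean_amp ** 2, size=(1, 1, depth_window))
    den = uniform_filter(mean_sq, size=(1, 1, depth_window)) + 1e-8
    return num / den
```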

Manual labeling has a few disadvantages, though:

  • It is cumbersome
  • The complex structure of faults can make it difficult to assign sticks to one surface
  • Because of its sparseness, the annotation is approximate and rough
  • Often, to save time, experts examine only a small area of the cube rather than the entire volume
  • Annotations made by different experts are subjective and inconsistent

The desire to automate this procedure as much as possible is obvious.

What approaches exist?

As noted above, geological attributes help to highlight possible faults. Unfortunately, they cannot serve as a fully automatic procedure, since any attribute highlights not only faults but also many other anomalies, such as noise or deviations from a horizontal parallel structure (e.g. Gibson et al., 2003; Hale, 2013; Mechado et al., 2016).

With the rise of ML, many papers and tools have appeared that use neural networks to label faults (e.g. Zheng et al., 2014; Xiong et al., 2018; Cunha et al., 2020; Wei et al., 2022). Indeed, creating an attribute with a fault probability at each point is exactly the task of binary semantic segmentation. If a probability cube is all we need, then we just have to train a neural network. But what are the challenges of pixel-wise fault segmentation?

  • Training dataset: an accurate model demands an extensive and diverse training dataset with exhaustive annotation. Since compiling such a dataset requires a lot of time and resources, synthetic datasets are often used. Unfortunately, designing algorithms that synthesize realistic data is hard, and models trained on such data often fail on real cubes. Another option is to partially label a cube and train the model on that cube alone, but such an approach can hardly be called “fully automatic”.
  • Huge dataset size: we cannot fit an entire cube of tens of gigabytes into GPU memory, so we have to train and run inference on 2D/3D crops of an appropriate size. To provide examples of crops both with and without faults, we must design an appropriate sampling procedure (a toy sampler is sketched after this list).
  • Preprocessing and augmentations: one should choose them wisely to make the neural network generalize better. A neural network without input normalization can fail on new examples with a different distribution of values. Augmentations can help to extend a small training dataset, but some of them break the physical properties of seismic data.
  • Training procedure: the typical problems of neural network training, such as the choice of architecture and hyperparameters, overfitting, the normalization procedure, etc.
  • Validation (QC): there are two possible ways to control the quality of the resulting solution. The first is to compare the result with ground truth, but reliable ground truth is available only for synthetic data: expert labeling is almost always noisy. The second is to check that the predicted faults are located in the right places, but there is no established validation metric to measure this. A metric of this kind for horizons is shown here.
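To illustrate the sampling and normalization points from the list above, here is a toy 2D crop sampler; the function and its parameters are hypothetical and much simpler than the samplers in SeismiQB, but they show the idea of oversampling crops that actually contain faults and standardizing each crop.

```python
# A toy crop sampler (hypothetical, not the SeismiQB API): oversample crops that
# contain fault pixels so that the rare positive class is seen often enough.
import numpy as np

def sample_crop(cube, mask, crop_shape=(256, 256), reject_empty=0.8, rng=np.random):
    """Return one 2D inline crop of the cube and its fault mask."""
    while True:
        i = rng.randint(cube.shape[0])                            # random inline slice
        x = rng.randint(cube.shape[1] - crop_shape[0] + 1)
        d = rng.randint(cube.shape[2] - crop_shape[1] + 1)
        image = cube[i, x:x + crop_shape[0], d:d + crop_shape[1]]
        target = mask[i, x:x + crop_shape[0], d:d + crop_shape[1]]

        # Accept fault-free crops only with probability 1 - reject_empty.
        if target.any() or rng.rand() > reject_empty:
            break

    # Per-crop standardization makes the model less sensitive to the amplitude
    # range of a particular field.
    image = (image - image.mean()) / (image.std() + 1e-6)
    return image.astype(np.float32), target.astype(np.float32)
```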

Nevertheless, all these problems except the last one are solvable. Recently, more and more open data from real fields has become available, either already labeled or ready to be labeled. For training, we have all the necessary tools in our open-source libraries SeismiQB and BatchFlow. As for the validation metric, it is important not only for ML models but for any fault detection result, which makes constructing such a metric an interesting task in itself.

In many applications, a probability cube is not enough, and experts need individual fault instances to be extracted from the model prediction. Detecting separate fault surfaces is more difficult and does not translate as directly into a machine learning task. Usually, surfaces are constructed on the basis of the labeled cube, but there are few articles on this topic.
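The most naive way to split a binary prediction into instances is 3D connected components; the sketch below (plain SciPy, illustrative parameter values) also shows why this is not enough on its own: touching or intersecting faults end up merged into a single component, which is exactly what a more careful procedure has to handle.

```python
# Naive instance extraction via 3D connected components.
import numpy as np
from scipy.ndimage import label

def split_into_instances(fault_mask, min_size=1000):
    """fault_mask: boolean cube; returns a list of (N_i, 3) point clouds."""
    labeled, n_components = label(fault_mask)
    instances = []
    for idx in range(1, n_components + 1):
        points = np.argwhere(labeled == idx)    # (inline, crossline, depth) triplets
        if len(points) >= min_size:             # drop tiny, noisy components
            instances.append(points)
    return instances
```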

How does our solution work?

Many software products (Petrel, Geoplat, OpendTect) claim to provide tools for fault detection. However, they still have various drawbacks, for example:

  • some of them require partial annotation,
  • others output predictions as a cube of probabilities instead of surfaces,
  • the resulting surfaces are often imprecise, badly interpolated, or self-intersecting.

We have been developing our surface-producing solution for quite some time, in constant interaction with expert geophysicists. Even such a seemingly simple step as defining the task turned out to be not so simple.

Like a horizon, each fault is a surface in the 3D volume of the seismic cube. The target of the fault detection task can be:

  • a labeled cube of the same size as the initial cube, with labels 1 (fault) and 0 (non-fault) or with fault probabilities in the [0, 1] range. Such a scenario does not involve instance detection, i.e. splitting the prediction into separate fault surfaces.
  • separate fault surfaces as point clouds
  • approximated fault surfaces as fault sticks (see below) or a triangulation

The choice of target depends on how the faults will be used: in our practice, we aim to produce surfaces as fault sticks because they are convenient to work with.
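As an illustration of the last two targets, here is a hypothetical converter from a fault point cloud to fault sticks: it takes every step-th inline and keeps a handful of evenly spaced nodes along the fault trace on that slice. It assumes the fault trace on each slice is single-valued in depth, which is not always true; real software adds smoothing and quality checks.

```python
# A hypothetical point-cloud-to-sticks converter (illustrative, simplified).
import numpy as np

def points_to_sticks(points, step=10, nodes_per_stick=5):
    """points: (N, 3) array of (inline, crossline, depth) fault coordinates."""
    sticks = []
    for iline in np.unique(points[:, 0])[::step]:
        trace = points[points[:, 0] == iline]
        trace = trace[np.argsort(trace[:, 2])]                   # order nodes by depth
        idx = np.linspace(0, len(trace) - 1, nodes_per_stick).astype(int)
        sticks.append(trace[np.unique(idx)])                     # one polyline per slice
    return sticks
```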

The major difference between faults and horizons is orientation: a labeled horizon is always a depth function of inline and crossline, while faults are vertical surfaces. At first glance, all you need is to transpose the cube and apply horizon detection tools, but there are key differences:

  • Faults can run along inlines, along crosslines, or in mixed directions, so a priori one doesn't know how to transpose the cube
  • A particular horizon can be annotated on most of the field traces, while each fault is located only in some bounded region
  • Faults can intersect with each other

Our solution has the following stages:

  • train an encoder-decoder model on 2D crops from several real cubes with expert annotation
  • use the trained model to obtain a cube of fault probabilities
  • perform postprocessing: zero out the prediction at the field's bounds, then smooth and skeletonize it
  • apply an iterative procedure of merging connected components on 2D slices to split the predicted volume into separate surfaces (a condensed postprocessing sketch is given below)
Example of training pipeline with open-source libraries SeismiQB and BatchFlow
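For a flavor of the postprocessing stage, below is a condensed sketch built from generic SciPy/scikit-image calls: smooth the probability cube, threshold it, and thin each slice. The function and parameter values are illustrative, not the SeismiQB API, and the zeroing at the field's bounds is omitted since it requires the field geometry.

```python
# A condensed postprocessing sketch: smoothing, thresholding and slice-wise
# skeletonization of the predicted fault probabilities.
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.morphology import skeletonize

def postprocess(probabilities, threshold=0.5, sigma=1.0):
    """probabilities: float cube with axes (inline, crossline, depth)."""
    smoothed = gaussian_filter(probabilities, sigma=sigma)
    binary = smoothed > threshold

    # Thin each 2D inline slice so that faults become one-pixel-wide curves,
    # which simplifies the later merging of components into surfaces.
    skeleton = np.zeros_like(binary)
    for i in range(binary.shape[0]):
        skeleton[i] = skeletonize(binary[i])
    return skeleton
```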

Together with the horizon detection procedure, the model allows us to obtain the basis of a structural model.

Horizons and faults constructed by our ML models

We will provide a more detailed description of the complete solution in the following articles.

Summary

Fault detection is not as easy as it might seem: challenges arise at every stage, from problem formulation to evaluation of the results. In this article, we have highlighted the key challenges and shown our results. We will continue to improve our solution while sharing more details about its internals. Stay tuned!

--
