Hands-on the desk with First Break Picking

Altynanke
Data Analysis Center
6 min read · Jan 19, 2024


As an old adage goes, “If there is something wrong with your stack, check the first breaks”.

First break picking is the task of determining the time at which a receiver starts recording a meaningful signal from a source. It is one of the initial steps of the seismic processing pipeline, and only knowledge of the first break times allows one to model the velocities and thicknesses of the uppermost layers of the earth. You can find more details on this stage of the seismic processing pipeline in our previous feature. Picking an entire survey usually requires months of painstaking manual work from an experienced geologist.

If you have been following us closely, you probably know we’ve already touched on this task and discussed why conventional methods aren’t enough for success. We covered the working principle of auto pickers and why even a 1D UNet can handle the task of first break picking (FBP) substantially better. For more details, please follow this link. Now it’s time for us to introduce our 2D first break picking approach based on linear moveout correction and a neural network.

Let us provide a little more context to the narrative. Our journey to this method started with an algorithm we’ve been experimenting with for some time. It was designed as an extension of the existing 1D algorithm to two spatial dimensions.

It follows a two-staged recipe:
1. At the first stage, a 1D UNet processes each trace separately and returns a binary mask marking everything after the first break.

2. At the second stage, a 2D UNet refines the obtained mask by accounting for spatial information; see Figure 1.

Figure 1: two-staged FBP approach. From left to right: shot gather, binary mask obtained after the first stage, refined mask after the second stage, gather with first breaks.
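The two-stage scheme can be sketched with toy numpy stand-ins for the networks: a simple amplitude threshold plays the role of the 1D UNet, and a running median across neighbouring traces plays the role of the 2D UNet. The function names and logic here are illustrative only, not the actual models.

```python
import numpy as np

def stage1_per_trace_mask(gather, threshold=0.5):
    """Stand-in for the 1D UNet: mark every sample after the first
    amplitude exceeding `threshold` as 1 (the post-first-break mask)."""
    mask = np.zeros_like(gather, dtype=np.int8)
    for i, trace in enumerate(gather):
        above = np.nonzero(np.abs(trace) > threshold)[0]
        if above.size:
            mask[i, above[0]:] = 1
    return mask

def stage2_spatial_refine(mask, half=2):
    """Stand-in for the 2D UNet: a running median across neighbouring
    traces removes isolated mispicks from the per-trace masks."""
    out = np.zeros_like(mask)
    n = mask.shape[0]
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out[i] = (np.median(mask[lo:hi], axis=0) >= 0.5).astype(mask.dtype)
    return out

def mask_to_picks(mask):
    """First break index = first sample where the mask switches to 1."""
    return mask.argmax(axis=1)
```

The real pipeline replaces both stand-ins with trained networks, but the data flow — per-trace binary mask, then spatial refinement, then mask-to-pick conversion — is the same.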

While this approach successfully eliminates artifacts caused by noise and phase inconsistency, it comes with a couple of drawbacks. First, in the second stage the model does not rely on trace data explicitly, which makes the result harder to interpret and debug. Second, the 2D model needs to be relatively large and thus painfully slow, as its inputs cannot be substantially cropped along the time axis and have to be broad enough along the channel axis to capture the trend across different receiver lines. As a result, training plus inference could take up to 4 hours on a large field. Still much faster than the manual approach, but there is definitely room for improvement!

We’ve experimented with various ways to shorten this procedure down to a single 2D model. However, feeding signal data instead of a binary mask to a 2D model is a significant complication: the more complex the model’s input, the larger both the training dataset and the model itself need to be.

But FBP datasets are relatively small, namely 15–50 shot gathers, which is not nearly enough for the proper training of a sufficiently large model. Training datasets for FBP are mostly prepared manually with little help from auto pickers, and to make the process of first break picking efficient and less laborious, training data preparation should not be a mammoth task. Moreover, FBP models should be trained (or finetuned) for each survey individually, as every field has its specific traits. In big companies, first break picking is performed routinely and frequently, so the preparation of new datasets should not become yet another bottleneck in the seismic processing pipeline.

However, one scheme met our requirements, and thus we’re eager to describe an elegant end-to-end approach utilizing linear moveout correction and only one UNet; see Figure 2.

The idea is straightforward:
1. To simplify the data, we perform linear moveout correction, after which the first breaks are roughly located along a straight line, giving the model a solid prior on where they might be.

2. As the first breaks are located along a straight line, the crop size along the time and channel dimensions can be substantially reduced, allowing us to also shrink the number of parameters and the time needed for training and inference.

Figure 2: shot gather with first breaks (left); shot gather after linear moveout correction (right)
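The correction itself is easy to sketch in numpy. Assuming a single known velocity (the real method uses piecewise refractor velocities), each trace is shifted up by its moveout time converted to samples; names and parameters below are illustrative.

```python
import numpy as np

def apply_lmo(gather, offsets, velocity, dt):
    """Apply linear moveout: shift each trace up by t = offset / velocity,
    so that first breaks line up near a constant time.

    gather:   (n_traces, n_samples) array
    offsets:  source-receiver offsets in meters, one per trace
    velocity: refractor velocity in m/s
    dt:       sample interval in seconds
    """
    out = np.zeros_like(gather)
    n_samples = gather.shape[1]
    shifts = np.round(offsets / velocity / dt).astype(int)
    for i, s in enumerate(shifts):
        if s < n_samples:
            # move the tail of the trace to the start; the bottom is zero-padded
            out[i, : n_samples - s] = gather[i, s:]
    return out
```

After this transform the network only needs to look at a narrow window around the (now nearly constant) expected arrival time, which is exactly what enables the smaller crops.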

Linear moveout correction straightens the hodographs, which are piecewise linear functions of offset, according to the refractor velocities. To calculate the refractor velocities, we use the first breaks available in the training set, together with the fact that beyond a certain offset the first useful signal always comes from the refractors. At the inference stage, the calculated refractor velocities are interpolated over the whole field; see Figure 3.

Figure 3: refractor velocity field (left); picked velocities for one shot gather (right)
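For intuition, here is a minimal sketch of how a velocity can be recovered from labelled picks, assuming a single refractor and a known crossover offset. Beyond that offset the picks follow t = t0 + offset / v, so a least-squares line gives 1/v as its slope. The function name and the `min_offset` default are our own, not the production code.

```python
import numpy as np

def fit_refractor_velocity(offsets, fb_times, min_offset=500.0):
    """Estimate a refractor velocity from labelled first breaks.

    Beyond `min_offset` (meters) the first arrival is the head wave,
    so picks lie on t = t0 + offset / v. A degree-1 least-squares fit
    returns (slope, intercept); the velocity is 1 / slope.
    """
    mask = offsets >= min_offset
    slope, intercept = np.polyfit(offsets[mask], fb_times[mask], 1)
    return 1.0 / slope, intercept  # velocity (m/s), intercept time (s)
```

The real method fits piecewise linear hodographs per gather and then interpolates the resulting velocities over the field, but the core of each piece is this one-line fit.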

However, introducing the model to a linear prior might result in insufficiently sensitive predictions. To tackle this issue we introduce a simple yet effective augmentation: at each training step, with some probability, roll a random set of consecutive traces up or down in time by a random shift; see an example in Figure 4.

Figure 4: Augmentation example
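A minimal numpy version of this augmentation might look as follows; the function name, defaults, and label handling are our own sketch, not the production code.

```python
import numpy as np

def roll_augmentation(gather, picks, p=0.5, max_shift=20, rng=None):
    """With probability `p`, pick a random contiguous block of traces and
    roll them (and their first-break labels) up or down by a random number
    of samples, simulating an abrupt break in the linear trend."""
    rng = np.random.default_rng() if rng is None else rng
    gather, picks = gather.copy(), picks.copy()
    if rng.random() < p:
        n = gather.shape[0]
        start = int(rng.integers(0, n))
        stop = int(rng.integers(start + 1, n + 1))
        shift = int(rng.integers(-max_shift, max_shift + 1))
        # roll the block along the time axis and keep labels consistent
        gather[start:stop] = np.roll(gather[start:stop], shift, axis=1)
        picks[start:stop] += shift
    return gather, picks
```

Crucially, the labels are shifted together with the traces, so the model is forced to follow the signal rather than the linear prior.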

This little trick teaches the model to react properly to abrupt changes in the scenery and also prevents overfitting. Have a look at Figure 5 to see the effect of employing the augmentation in the training process.

Figure 5: The effect of introduced augmentation. Blue dots correspond to no augmentation.

This approach allows us to perform the whole stage of FBP in less than an hour for almost any field using a single GPU! For instance, model training with 13k traces takes 15 minutes, and inference on a survey with 18 million traces takes only about 40 minutes.

The last step is to ensure that the quality of obtained first breaks meets certain requirements. Depending on whether it’s the minimum or maximum phase we’re interested in, we want to make sure the picked phase is consistent across the field. Moreover, we don’t want any outliers to be present in the final picking.

To tackle this challenge, we provide various interactive QC maps that detect outliers from the expected time, phase, and amplitude inconsistencies, calculate signal correlation around the first break across the gather, and even estimate the offset after which the first breaks are most likely to diverge from the expected time. Gathers with deviations are highlighted on the map, allowing an expert to quickly identify the most problematic areas; see Figure 6.

Figure 6: Metric map of first-break outliers (left); visualization of the metric for one gather (right). Outliers are dots beyond the red area, and the metric is calculated as the fraction of outliers in an individual gather.
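As a sketch, the per-gather metric from the caption reduces to the fraction of picks deviating from the expected time by more than a tolerance; the function name and tolerance value here are assumptions for illustration.

```python
import numpy as np

def outlier_fraction(fb_times, expected_times, tolerance=0.02):
    """Per-gather QC metric: the fraction of first breaks deviating from
    the expected (e.g. LMO-predicted) time by more than `tolerance`
    seconds — the dots beyond the red band on the metric map."""
    deviations = np.abs(np.asarray(fb_times) - np.asarray(expected_times))
    return float(np.mean(deviations > tolerance))
```

Computing such a scalar per gather is cheap, which is what makes map construction over millions of traces fast.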

Map construction is also highly optimized: calculating five different metrics for a survey of 4.5 million traces takes only 2.5 minutes on a CPU.

Conventionally, to detect anomalies, experts had to meticulously wade through a plethora of gathers on a sparse grid, which may number in the thousands for a large field. The proposed approach not only greatly reduces the time needed to assess first breaks, but also makes the estimation more accurate, as it assesses every single gather in the field.

Moreover, the ability to promptly assess first breaks enables one to check the quality of the training set. This is a crucially important step, as any glitches in the training set would result in similar ones during inference. Our approach allows one to automatically screen out any inferior picks, leaving only representative data on board. See Figure 7 for an example of phase inconsistency.

Figure 7: Here we detected first breaks tracked at the maximum phase instead of the minimum.

To sum up, we’ve shared our most successful approach to first break picking. Carrying out this stage swiftly and accurately is highly likely to boost the quality of the final stack and eliminate the sweated labor associated with it. We hope our experience will save you some struggle and spark new ideas for handling your own challenges. See ya next time!
