Augmenting Radar Data with Shifts & Flips

Hezi Hershkovitz · Published in Gradient Ascent · Nov 18, 2020

This is the 3rd article in our MAFAT Radar competition series, where we take an in-depth look at the different aspects of the challenge and our approach to it.
If you want a recap, check out the previous posts:
the introduction and the dataset.

As we already discussed, the two major issues with the provided dataset are:

  1. The small size of the training data
  2. The imbalance of the dataset

In order to improve the model’s results, we needed to find more relevant data. One of the common ways of doing this in machine learning is Data Augmentation: a technique that increases the amount of data by adding slightly modified copies of already existing samples.

We believe that most of the improvement in our model’s performance came from data augmentation. Training a neural network requires a lot of data samples, and the roughly 6,000 samples given in the original dataset are simply not enough for a model to generalize well. Especially since we intended to train bigger models, we had to come up with a way to feed the model more data.

How to augment the data

Let’s take a closer look at the data we got to see how we can augment it.

Data from track #241

Shifts

I will explain based on a single track (track #241), though the same explanation holds for all the other tracks in the training dataset.

In track #241, we can see that the track is composed of 11 segments, all of which share similar characteristics (low SNR, sensor type, date, and the same geolocation type and id).

Plotting a complete track is actually quite easy, since the code for it was provided by the competition organizers.

Track #241 divided into its respective segments

The image above shows one complete track from the training dataset, obtained by combining all of its segments. The red rectangles mark the separate segments included in this track. The white dots are the ‘doppler burst’, which represents the center of mass of the tracked object.

With the help of the ‘doppler burst’ white dots, we can quite easily see that the track is composed of adjoining segments, i.e. segment id 1942 is followed by 1943, then 1944, and so on.

The fact that the segments sit next to each other allows us to use shifts in order to create “new” samples.

Shifting one segment 3 times

The image above shows shifts of 8, 16, and 24, respectively. We shift the segments only along the x-axis (the ‘long-time’ axis). For each sample we can make up to 31 shifts.
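To make the shift mechanics concrete, here is a minimal NumPy sketch (not taken verbatim from our repo; iq_a/iq_b and burst_a/burst_b are hypothetical stand-ins for two adjoining segments):

import numpy as np

def shift_segment(iq_a, iq_b, burst_a, burst_b, shift_by):
    """Build a 'new' (128, 32) segment by sliding a window across two adjoining segments.

    iq_a, iq_b       : (128, 32) IQ matrices of segment N and segment N+1
    burst_a, burst_b : (32,) doppler-burst vectors of the same two segments
    shift_by         : window offset along the 'long-time' (x) axis, 1..31
    """
    iq_pair = np.concatenate([iq_a, iq_b], axis=1)     # (128, 64)
    burst_pair = np.concatenate([burst_a, burst_b])    # (64,)
    return iq_pair[:, shift_by:shift_by + 32], burst_pair[shift_by:shift_by + 32]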

Why not shift along the y-axis?

The y-axis (the radar’s ‘short-time’ axis) is cyclic: points on row 0 are “the same” as those on row 127. This can be observed in the “yellow patches” at the bottom and the top of the segments (which we were told represent the vegetation in the scene the radar ‘sees’). The “yellow patches” are mostly concentrated in rows 0 and 127, and the samples become less and less yellow as we move up from row 0 or down from row 127.

For this reason we chose not to shift the segments along the y-axis, as we thought that the samples such a shift would generate might not represent ‘valid’ radar samples.

Flips

Much like standard image data augmentation, we can simply flip the sample vertically or horizontally to generate additional samples, as shown in the image above.

(The piggy was added to help you visualize the flips, and it represents the animal that is hidden inside the radar data 😉 )
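A minimal NumPy sketch of generating the three flipped copies of one segment (the array here is a dummy stand-in for a real IQ matrix):

import numpy as np

iq = np.random.randn(128, 32) + 1j * np.random.randn(128, 32)   # dummy stand-in for one segment's IQ matrix

flipped_vertical = np.flip(iq, axis=0)       # mirror along the 'short-time' (y) axis
flipped_horizontal = np.flip(iq, axis=1)     # mirror along the 'long-time' (x) axis
flipped_both = np.flip(iq, axis=(0, 1))      # vertical + horizontal combined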

Data augmentation considerations

Avoid leakage

We need to be careful not to include validation data in the training set. A track may include segments that were chosen to be part of the validation set. As a result, a shift that overlaps a segment belonging to the validation set will ‘leak’ validation data into the training set.

A track separated into segments. Since segment #7 is part of the validation set, we can’t use shifts generated from 6→7 or 7→8 in the training set, as that would cause data leakage

So, for example, in our track #241, let’s suppose that segment #7 is part of the validation set. Then of course we can’t use segment #7 itself to generate shifts, nor can we use shifts that draw on its adjoining segments (the 6→7 and 7→8 pairs), since that would cause parts of the validation set to be included in the training set.

We added an is_validation column for each segment in the dataset. The code that loops over all adjoining segment pairs in a track and decides whether it is OK to generate a new shifted/flipped sample from them is as follows:

segment_idxs = list(data[data.track_id == track_id_t].index)
# pair every segment with the one that follows it in the track
segment_idxs = [(x, y) for x, y in zip(segment_idxs, segment_idxs[1:])]
iq, burst = concatenate_track(data, track_id_t, snr_plot='both')

columns = ['geolocation_type', 'geolocation_id', 'sensor_id', 'snr_type', 'date_index', 'target_type']
for seg_id in segment_idxs:
    ok_to_add = True
    # the pair must agree on all metadata columns...
    for col in columns:
        if data.iloc[seg_id[0]][col] != data.iloc[seg_id[1]][col]:
            ok_to_add = False
    # ...and neither segment may belong to the validation set
    if data.iloc[seg_id[0]].is_validation or data.iloc[seg_id[1]].is_validation:
        ok_to_add = False

As you can see from the code above, when checking an adjoining segment pair, we check whether either of them is part of the validation set. In addition, we check whether any of the other available parameters differ (i.e. the geolocation type and id, the sensor id, the SNR type, and so on).
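To make the bookkeeping concrete, here is a hypothetical helper (not verbatim from our repo) showing what a pair that passes the check contributes; it uses the ‘augmentation_info’ record format we introduce further down:

def shift_candidates(seg_pair, ok_to_add, max_shift=31):
    """Return the shift records that a clean adjoining segment pair contributes."""
    if not ok_to_add:
        return []
    return [{'from_segments': list(seg_pair), 'type': 'shift', 'shift': s}
            for s in range(1, max_shift + 1)]

# e.g. shift_candidates((1943, 1944), ok_to_add=True) yields 31 candidate shifted samples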

The maximum number of samples after applying 31 shifts and 3 flips (horizontal, vertical and horizontal-vertical) to all 6656 samples in the original dataset is 6656 × 31 × 3 = 619,008 samples. With the above code, which skips validation-set segments and segments with mixed parameters, we ended up with about 570K samples that we could use for training our model.

That’s a huge increase over the 6656 samples provided in the original dataset. It allowed us to train a larger model (we used AlexNet) and get much better results than the base model.

What to augment: the IQ matrices or the spectrograms?

In theory both are possible. The spectrograms are 2D matrices, so they can be treated as images, and you can use libraries such as Keras’ ImageDataGenerator to do the augmentations. We did some preliminary work on that, and it seemed to work.
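For reference, a minimal sketch of that spectrogram-as-image route (array names and shapes are illustrative; note that ImageDataGenerator’s built-in shift pads within a single segment rather than borrowing data from the adjoining one, unlike our track-level shifts):

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

spectrograms = np.random.rand(64, 126, 32, 1).astype('float32')   # dummy stand-in: (N, H, W, 1)
labels = np.random.randint(0, 2, size=64)                         # dummy binary labels

datagen = ImageDataGenerator(horizontal_flip=True,
                             vertical_flip=True,
                             width_shift_range=0.25)   # shift along the 'long-time' axis, edge-filled
x_batch, y_batch = next(datagen.flow(spectrograms, labels, batch_size=32))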

However, we eventually decided to perform the augmentations directly on the IQ matrices, as we thought that might be required for the more complex models we were testing, such as scalogram-based ones.

It is important to note that any transformation applied to the IQ matrix must also be applied to the segment’s ‘doppler burst’ data: when flipping the IQ matrix, you also need to flip the ‘doppler burst’, and the same goes for shifts.
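A minimal sketch of the matching ‘doppler burst’ transformations, following the same conventions used in the DataLoader code further down (burst is a dummy stand-in for a segment’s (32,) doppler-burst vector):

import numpy as np

burst = np.random.randint(0, 128, size=32)   # dummy stand-in for a segment's doppler-burst vector

burst_vertical = np.abs(128 - burst)         # vertical flip: mirror the values across the velocity axis
burst_horizontal = np.flip(burst)            # horizontal flip: reverse the time order
# for shifts, slice the burst with the same window used for the IQ matrix (see the shift sketch above)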

Memory Limitations

In hindsight, coming up with the idea of how to augment the data was the easy part. The hard part was actually generating the augmented data while fitting it into our available resources. As we were about to discover, we needed to make changes to our code and pipeline in order to train the model with all the augmented data.

  1. Running just 3 shifts (8, 16, 24) on Colab showed improved results. Trying to run all 31 shifts crashed on Colab because training ran out of RAM.
  2. We switched to bigger instances on AWS, but even that was not enough for the full set of shifts (~500K samples).
  3. We refactored the code to generate the augmentations on the fly: instead of precomputing the augmented samples and storing them all in RAM, we generate the shifts and flips inside the DataLoader during training.

Generating augmentations on the fly

In order to save RAM, we changed the way we generate the augmented samples. Originally, we simply pre-calculated the new segments’ IQ matrices and ‘doppler bursts’ and added them to the training set. However, each segment contains 128 × 32 complex numbers, and as mentioned previously, holding 500K such samples in RAM exceeds our resource limits.
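As a rough back-of-the-envelope check (assuming each IQ value is stored as a 128-bit complex number):

import numpy as np

bytes_per_segment = 128 * 32 * np.dtype(np.complex128).itemsize   # 4096 values x 16 bytes = 64 KiB
total_gib = 500_000 * bytes_per_segment / 1024 ** 3
print(f"~{total_gib:.1f} GiB just for the precomputed IQ matrices")   # ≈ 30.5 GiB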

Since we were working on this challenge on a budget, it was important for us to run the training with as few resources as possible, preferably on Colab, where we could train the model for free.

In order to train using all 500K samples, we moved the actual generation of the augmented data into the dataset’s __getitem__() method (called by the DataLoader). We still add a row for every augmented sample in a preliminary step, but instead of saving the entire IQ matrix for each one, we only save an ‘augmentation_info’ dictionary, like this:

augmentation_info = [{
    'from_segments': [1455, 1456],
    'type': 'shift',
    'shift': 14,
}, {
    'from_segment': 1455,
    'type': 'flip',
    'mode': 'vertical',
}, {
    'from_segment': 1455,
    'type': 'flip',
    'mode': 'horizontal',
}]

The memory that such an object requires is several orders of magnitude less than what is needed to store the IQ matrices themselves.

In the DataLoader we check the augmentation_info field, and if it exists and contains the required augmentations, we apply them one after the other to the original sample in order to generate the new shifted/flipped sample.

A few important things about the ‘augmentation_info’ dictionary:

  1. The transformations inside the ‘augmentation_info’ array are processed in order.
  2. If there is a type=’shift’ entry, there will be only one, and it will always be first in the array.
  3. The first transformation in the array is the one used to generate the initial IQ matrix, which is then used by the following transformations.

Each new sample has at most 3 transformations in its ‘augmentation_info’ dictionary. The possible transformation combinations are:

  • [shift, vertical, horizontal]
  • [shift, vertical]
  • [shift, horizontal]
  • [vertical, horizontal]
  • [vertical]
  • [horizontal]

We can then use this data in the DataLoader as such:

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

import specto_feat  # preprocessing utilities provided by the competition organizers (adjust the import path to your project layout)

class DS(Dataset):
    def __init__(self, df):
        super().__init__()
        self.df = df

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        data_inner = self.df.iloc[idx].copy()  # use iloc here because we need the absolute row position
        if data_inner.iq_sweep_burst is None:
            # augmented sample: rebuild its IQ matrix and doppler burst from the source segment(s)
            iq_matrix = None
            doppler_vector = None
            for augment_info in data_inner.augmentation_info:
                if augment_info['type'] == 'shift':
                    iq_list = []
                    doppler_list = []
                    from_segments = augment_info['from_segments']
                    shift_by = augment_info['shift']
                    for i in from_segments:
                        # use loc here because we need the actual segment id (by index)
                        iq_list.append(self.df.loc[i]['iq_sweep_burst'])
                        doppler_list.append(self.df.loc[i]['doppler_burst'])
                    iq_matrix = np.concatenate(iq_list, axis=1)            # 2 x (128, 32) => (128, 64)
                    doppler_vector = np.concatenate(doppler_list, axis=0)  # 2 x (32,) => (64,)
                    # cut the iq_matrix (and doppler burst) according to the shift
                    iq_matrix = iq_matrix[:, shift_by:shift_by + 32]
                    doppler_vector = doppler_vector[shift_by:shift_by + 32]
                if iq_matrix is None and augment_info['type'] == 'flip':
                    # flip-only sample: start from the original segment's data
                    from_segment = augment_info['from_segment']
                    iq_matrix = self.df.loc[from_segment].iq_sweep_burst
                    doppler_vector = self.df.loc[from_segment].doppler_burst
            data_inner.iq_sweep_burst = iq_matrix
            data_inner.doppler_burst = doppler_vector
        # convert to the structure expected by the preprocess method
        data_inner_o = {k: [v] for (k, v) in data_inner.to_dict().items()}
        data_inner_o['target_type'] = np.asarray(data_inner_o['target_type'])
        # preprocess (turn the IQ matrix into a spectrogram, etc.)
        data = specto_feat.data_preprocess(data_inner_o)
        data['target_type'] = np.array(int(data['target_type'][0]), dtype='int64')
        # augmentations: do the flips (if needed)
        if 'augmentation_info' in data.keys():
            for augment_info in data['augmentation_info'][0]:  # the [0] is because we wrapped values in lists above
                if augment_info['type'] == 'flip':
                    if augment_info['mode'] == 'vertical':
                        data['iq_sweep_burst'] = np.flip(data['iq_sweep_burst'], 0)
                        data['doppler_burst'] = np.abs(128 - data['doppler_burst'][0])
                    if augment_info['mode'] == 'horizontal':
                        data['iq_sweep_burst'] = np.flip(data['iq_sweep_burst'], 1)
                        data['doppler_burst'] = np.flip(data['doppler_burst'])
        label2model = data['target_type']
        data2model = data['iq_sweep_burst']
        data2model = np.expand_dims(data2model.squeeze(), axis=2)  # add a channel dimension: (H, W) -> (H, W, 1)
        return torch.from_numpy(data2model.copy()), torch.tensor(label2model.astype(np.int64))

train_set = DS(full_data[full_data.is_validation == False])
train_loader = DataLoader(dataset=train_set, batch_size=batch_size, shuffle=True, num_workers=2)

Using this method of computing the augmentations in the DataLoader, we were able to feed the model the entire 570K samples, which allowed us to improve the result to 0.82 AUC on the public test set.

The next improvement in the results came from using PyTorch’s IterableDataset. It required quite a bit of change to the pipeline and additional code refactoring, so we will explain it in a separate post.

Conclusion

In this article I explained one of the most important factors in improving the model’s results: data augmentation. In the next articles we will share additional methods, both architectural and functional, that we applied to try to improve the results even further.
Our code is on GitHub, and you can access it here.

Stay tuned!
