MAFAT Radar Challenge: Solution by Axon Pulse

Radar target classification — is it human or an animal?

Axon Pulse
8 min read · Nov 9, 2020


Radars are booming! Nowadays you can find radars almost everywhere: from the classic homeland-security sector at the state level, through local radars for automotive, and even radars for healthcare at home or in your pocket. No wonder MAFAT (the Israel Ministry of Defense Directorate of Defense Research & Development, DDR&D) chose to focus on radar for its annual AI challenge, with a record-high cash prize of $40K in total.

We had the pleasure of participating in that challenge and finished in 2nd place (just 0.23% behind 1st)! It was a great opportunity to give a glimpse of the enormous potential of mixing radar and AI, and we feel it's a good time to share large parts of our solution and how it was obtained.

In this article we will walk through the whole process the Axon-Pulse team went through to reach 2nd place: data inspection and preparation, augmentations, models and, finally, ensembling. As far as we know, this was a one-of-a-kind competition in its domain, so there were very few "best practices" or well-tested approaches for this kind of data. That makes it even more interesting: let's start an adventure into a new field of AI.

The Challenge

“The participants’ goal is to accurately classify whether a radar signal segment represents a human or an animal.”

Can you tell a human from an animal? This task is important for securing an area such as a village, a farm or a military base. In many cases, radar is the most effective available option, both cost-wise and performance-wise: regular and IR cameras are expensive, suffer badly in poor weather, and have limited range and field of view. Moreover, unlike radars, cameras can't measure the speed of an object, which turns out to be crucial for detecting and classifying it.

Radar Time-Doppler segment examples

Performing such classification using radar is really challenging. The resolution of the radar is too low to extract the shape of the target: all you get is a "dimmed shadow" of it. Moreover, the behaviour of the targets, their movements, their speeds, and even their "reflectiveness" (the amount of radar signal they reflect, professionally called Radar Cross Section, or RCS) may all seem similar.

Having said that, we managed to do it. Traditionally, radar operators and experts can in some cases distinguish between a human and an animal, some by listening to the signal, some by looking at the spectrogram. So perhaps there are hidden features in the data that classical RF (Radio Frequency) processing doesn't use… That's perfect for AI.

For this particular challenge, the radar data was given in its rawest representation: IQ samples (the In-phase and Quadrature convention is a simple way to express the two orthogonal components, i.e., the sine and cosine parts, of an alternating signal), plus tables containing the available meta-data.

  1. The IQ samples can be interpreted as a time-Doppler segment: 32 consecutive pulses, each containing 128 samples in time. The most common way to view this data is after a transformation called STFT (i.e., a spectrogram), which shows the magnitude of the received echoes in each frequency bin and how it evolves over time (see the sketch after this list).
  2. The meta-data contains the label, of course, but also auxiliary features: the estimated SNR (Signal-to-Noise Ratio) of the segment, the geolocation type and specific geolocation ID it was taken from (given as indices), the specific radar ID, and a special feature called "Doppler burst": the estimated frequency bin in which the target's center is located for each pulse in the segment (32 values).
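
Here is a minimal sketch of that transformation, assuming the segment is stored as a 128×32 complex matrix (128 time samples for each of the 32 pulses); the windowing and dB scaling are our own choices, not necessarily the organizers' exact preprocessing:

```python
import numpy as np

def iq_to_spectrogram(iq, to_db=True):
    """Turn a raw IQ segment into a time-Doppler map.

    `iq` is assumed to be a complex array of shape (128, 32): 128 samples
    for each of 32 consecutive pulses. An FFT along the 128-sample axis
    gives one Doppler spectrum per pulse; stacking them yields the map.
    """
    iq = iq * np.hamming(iq.shape[0])[:, None]   # taper to reduce sidelobes
    doppler = np.fft.fftshift(np.fft.fft(iq, axis=0), axes=0)
    mag = np.abs(doppler)                        # (128, 32) magnitude map
    if to_db:
        mag = 20 * np.log10(mag + 1e-12)         # log scale, as radar displays use
    return mag
```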

Here are some statistics:

Training data statistics

In a well-done webinar and a well-documented Colab notebook, the organizers described their best-practice preprocessing and gave an example of how to feed the Doppler-burst data into a model. In addition, the meta-data statistics were given so we could examine their distribution (hint: highly imbalanced). And… that's about it. From there you're on your own.

Kick off

Although we had a lot of prior experience with radar data, every challenge is unique. Right at the beginning, we faced some important questions: How should we take advantage of the IQ data? What should we do with the "Doppler burst"? What is the best way to deal with the imbalanced data distribution (upsampling? focal loss? and if so, how)?

We realized this data demands careful treatment, and that we might need to try new approaches on top of those we already had. Therefore we split the team to do quick research on the major questions:

  1. Should we treat this signal as sound? If so, what are the best audio neural networks, how do they work, what data are they trained on, and would they suit us?
  2. Augmentations: the provided data is not that diverse, and we had to make sure our models generalize well. How do we augment this data without hurting its nature? It was clear that the usual CV augmentations would not fit.
  3. Ensembles: winning a challenge is different from delivering an edge-AI product in real life (such as an AI-based radar). To ship a functional AI product that runs on the edge, you must optimize performance and computational resources: the model has to run in near real time, sometimes on an existing CPU without any GPU. In challenges, and in this one in particular, you are free to use as much computational power as you want. So we had to switch from a product-oriented mindset to a challenge-oriented one, and we sharpened our ensembling algorithms to raise our score.

After those questions were answered, we started the fun part — coding.

The main aspects that gave us the best results:

Data representation

  1. We used many audio-like representations. The plain STFT spectrogram is the most common, but we also used the mel-spectrogram, MFCC, spectral contrast, chroma and others. We believed each representation would capture a different type of feature, and that combining them all would cover the data more fully.
  2. How should you treat the nature of complex signals: real-imag? abs-phase? power-phase? dB(power)-phase? Why not all of the above? We fed the inputs to the models in these various representations, creating a different model for each one (see the sketch after this list).
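
For illustration, here is a minimal sketch of those complex-signal views; the helper name and the channel stacking are our illustrative choices here, not the exact competition code:

```python
import numpy as np

def complex_views(spec):
    """Build several 2-channel views of a complex time-Doppler map.

    `spec` is a complex 2D array (the FFT output, before taking the
    magnitude). In our setup, each view fed its own model.
    """
    power = np.abs(spec) ** 2
    return {
        "real_imag":   np.stack([spec.real, spec.imag]),
        "abs_phase":   np.stack([np.abs(spec), np.angle(spec)]),
        "power_phase": np.stack([power, np.angle(spec)]),
        "db_phase":    np.stack([10 * np.log10(power + 1e-12), np.angle(spec)]),
    }
```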

Augmentations

Special augmentations for RF data: here at Axon-Pulse we have researched and developed special tools for generating augmented RF signals that outperform standard CV augmentations. This is part of our "secret sauce", but here are a few examples (sketched in code below the figure):

  1. Random phase shift: the initial phase of the signal is arbitrary, so a phase shift of the whole segment should not change a thing.
  2. Adding various amounts of noise: thermal noise (white Gaussian noise), and in this case also the auxiliary background-noise segments, mixed in at different amplitudes, phases, time shifts and frequency flips. RF data is "transparent": each sample can contain echoes from many objects.
  3. Manipulating the "Doppler burst": we figured the Doppler burst is not precisely defined, and we suspected that small random changes to its values are plausible and can occur in real-life measurements. This turned out to be essential, because after all our other augmentations the Doppler burst was left almost untouched, leaving a pattern the network could overfit.
Augmented samples with different types of noise
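
Here is a minimal sketch of examples 1 and 3 plus plain thermal noise; the helper names, the AWGN-only noise model and the ±1-bin jitter are illustrative assumptions, not our exact augmentation code:

```python
import numpy as np

rng = np.random.default_rng()

def random_phase_shift(iq):
    """Rotate the whole segment by a random global phase; the absolute
    phase of the echo is arbitrary, so the label must not change."""
    return iq * np.exp(1j * rng.uniform(0.0, 2.0 * np.pi))

def add_thermal_noise(iq, snr_db):
    """Inject complex white Gaussian noise at a chosen SNR. (The real
    pipeline also mixed in recorded background segments; plain AWGN is
    shown here for brevity.)"""
    p_signal = np.mean(np.abs(iq) ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(iq.shape) + 1j * rng.standard_normal(iq.shape)
    return iq + noise * np.sqrt(p_noise / 2.0)

def jitter_doppler_burst(burst, max_shift=1):
    """Randomly perturb the per-pulse center-frequency bins so the network
    cannot overfit an always-untouched burst pattern."""
    return burst + rng.integers(-max_shift, max_shift + 1, size=burst.shape)
```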

Models: we insisted on trying many model architectures, both architectures designed for 2D image-like inputs and architectures that take the signal as a 1D time series. We also used a few of our custom Axon-Pulse architectures for IQ signals, tailor-made for RF data. This led to very different models focusing on different levels of features in the data.
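
Purely for illustration (these are not our actual architectures), the same segment can feed either family, depending on the chosen representation:

```python
import torch
import torch.nn as nn

# A 2D conv stem for spectrogram-like inputs and a 1D conv stem for the
# raw IQ signal treated as a 2-channel (real/imag) time series.
stem_2d = nn.Conv2d(in_channels=2, out_channels=16, kernel_size=3, padding=1)
stem_1d = nn.Conv1d(in_channels=2, out_channels=16, kernel_size=7, padding=3)

spec = torch.randn(8, 2, 128, 32)     # batch of dB/phase time-Doppler maps
series = torch.randn(8, 2, 4096)      # batch of flattened IQ time series
print(stem_2d(spec).shape, stem_1d(series).shape)
```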

Losses: nothing special here, actually. The classic cross-entropy loss was used almost all the way. In the end we integrated focal loss into some models, but to be honest it didn't make much of a difference.
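
For reference, here is a generic binary focal loss in the spirit of Lin et al. (2017); the gamma value and the lack of class weighting are assumptions, not our competition settings:

```python
import torch
import torch.nn.functional as F

def binary_focal_loss(logits, targets, gamma=2.0):
    """Down-weight easy examples by (1 - p_t)^gamma; targets are floats in {0, 1}."""
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                 # model's probability for the true class
    return ((1.0 - p_t) ** gamma * ce).mean()
```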

Data balancing: this was a BIG issue for us, and we realized it was a key factor in the challenge. First, because the given data was imbalanced, and second, because many samples were taken from the same track and were therefore very similar to each other. Putting such samples in both the training set and the validation set is a big no-no. We created a split that dodges that bullet by carefully picking training and validation segments that do not share a common track ID.
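
A minimal sketch of such a leakage-free split, using scikit-learn's GroupShuffleSplit; the arrays are hypothetical stand-ins for the challenge metadata (one row per segment):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical stand-ins for the real metadata table.
n = 1000
features = np.random.randn(n, 10)
labels = np.random.randint(0, 2, size=n)
track_ids = np.random.randint(0, 120, size=n)   # several segments share a track

splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, val_idx = next(splitter.split(features, labels, groups=track_ids))

# Sanity check: no track appears on both sides of the split.
assert set(track_ids[train_idx]).isdisjoint(track_ids[val_idx])
```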

Predictions cross correlation matrix of our 76 models

Ensembles: after training more than 75(!) models, most of them achieving an AUC above 0.9 (the best reached 0.98), we had to improve their combined result. We assumed that, having been trained on different representations, augmentations, architectures and splits, their errors would be uncorrelated enough for a combined prediction to beat any single model. We tried xgboost and other well-known algorithms as the meta-model; the best one was eventually xgboost (surprise, surprise!).
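
Conceptually, the stacking step looks like the sketch below; the hyperparameters and the stand-in arrays are illustrative, not our tuned values:

```python
import numpy as np
import xgboost as xgb

# Hypothetical stand-ins: out-of-fold probabilities from 76 base models
# become meta-features, with the matching labels as targets.
oof_preds = np.random.rand(5000, 76)
labels = np.random.randint(0, 2, size=5000)
test_preds = np.random.rand(1000, 76)

meta = xgb.XGBClassifier(n_estimators=200, max_depth=3, learning_rate=0.05)
meta.fit(oof_preds, labels)
final_scores = meta.predict_proba(test_preds)[:, 1]   # ensembled P(human)
```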

And… we crossed our fingers multiple times before hitting submit :)

Aftermath

This experience was awesome! We faced very difficult challenges and we are very proud of our team effort and our solution. As in any AI project, you cannot accurately predict (😉) what will give you the win. Sometimes the things that seem far-fetched give you a eureka moment; sometimes you laugh at yourself for even thinking about them. Luckily, we have built a team, know-how and infrastructure that can handle these unique challenges successfully. We believe our experience in both RF/radar signal processing and AI/deep learning made a huge difference and enabled this remarkable result.

We hope that this challenge will open the door to a new AI RF era.

We are Axon Pulse, a start-up from the Razor-Axon AI group, revolutionizing the radar industry with deep learning.

Contact us: https://www.linkedin.com/company/axon-pulse
