A really friendly guide to use of Wavelet Theory in Machine Learning (Part 1)

Kaustav Tamuly
Intel Student Ambassadors
5 min read · Feb 13, 2019

The first part of this series of blog posts will cover the basics of Fourier transform and Wavelets.

The Wavelet Transform is a very powerful time-series analysis tool that isn’t very popular in the Data Science community. We are already familiar with other signal processing tools such as the Fourier Transform, which is very intuitive, albeit a bit mathematical. These transformations have shifted our perspective from focusing on the signal itself to appreciating its underlying building blocks.

The reader is expected to be familiar with the fundamentals of the Fourier Transform. Essentially, the Fourier transform takes a signal from the time domain to the frequency domain. It works well only when the signal is stationary, i.e. the frequencies present in the signal do not change over time. This is a limitation, because most of the signals we encounter in real life, such as stock data and sensor data, are not stationary. Wavelets, on the other hand, handle these dynamic signals well because they are localized in both time and frequency.

Here’s a short description of how the Fourier transform works: we multiply the signal (take its dot product) with a series of sine waves of different frequencies. If the resulting value for a given frequency is high, the signal overlaps strongly with that sine wave, which means that frequency is present in the signal.
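The dot-product idea can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the original post: the sampling rate, the test signal and the helper name correlation_with_frequency are made up for the example, and a complex sinusoid is used as the probe so that both sine and cosine phases are covered.

```python
import numpy as np

fs = 200                      # sampling rate in Hz (assumed for this sketch)
t = np.arange(fs) / fs        # one second of samples

# Test signal: a 5 Hz sine plus a weaker 20 Hz sine (illustrative choice).
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

def correlation_with_frequency(x, freq):
    """Magnitude of the dot product of x with a complex sinusoid at `freq` Hz."""
    probe = np.exp(-2j * np.pi * freq * t)
    return np.abs(np.dot(x, probe)) / len(x)

# Probe three frequencies: two that are in the signal and one that is not.
scores = {f: correlation_with_frequency(signal, f) for f in (5, 12, 20)}
for freq, score in scores.items():
    print(f"{freq:>2} Hz -> {score:.3f}")
```

The score is large for the frequencies that are actually in the signal and close to zero for 12 Hz, which is the "peak" behaviour described above.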

Regardless of how a signal’s frequencies are arranged in time, the peaks obtained from the Fourier transform only tell us which frequencies are present. They cannot tell us when those frequencies occur, which is why two signals that contain the same frequencies in a different order look the same to the transform. The short-time Fourier transform (STFT) attempts to solve this problem by breaking the signal into shorter segments of equal length and computing the Fourier transform of each segment separately.
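To make this concrete, here is a small sketch with two made-up signals that contain the same two frequencies in opposite order. The sampling rate, the frequencies (10 Hz and 50 Hz) and the helper top_two are assumptions chosen for the illustration.

```python
import numpy as np

fs = 1000                     # sampling rate in Hz (assumed)
t = np.arange(fs) / fs        # one second of samples

# Signal A: 10 Hz during the first half, 50 Hz during the second half.
a = np.where(t < 0.5, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 50 * t))
# Signal B: the same two frequencies, in the opposite order.
b = np.where(t < 0.5, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 10 * t))

freqs = np.fft.rfftfreq(len(t), d=1 / fs)
spec_a = np.abs(np.fft.rfft(a))
spec_b = np.abs(np.fft.rfft(b))

def top_two(spec):
    """Frequencies (Hz) of the two largest magnitude peaks, in ascending order."""
    return sorted(freqs[np.argsort(spec)[-2:]])

print("A peaks at:", top_two(spec_a))
print("B peaks at:", top_two(spec_b))
```

Both spectra peak at the same two frequencies, so the magnitude spectrum alone gives no hint about which frequency came first.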

[Figure: STFT of the signal computed with 25 ms, 125 ms, 375 ms and 1000 ms windows]

However, the problem with this approach is that the STFT has a fixed resolution, which runs into theoretical limits set by the uncertainty principle. The smaller we make the window, the more precisely we can identify the time at which frequencies are present, but the harder it becomes to identify the frequencies themselves. As we increase the window size, we can pin down the frequencies more precisely, but the time at which the frequencies change becomes blurry.
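The trade-off can be seen numerically with a deliberately simple, hand-rolled STFT. This sketch makes several assumptions: non-overlapping Hann-windowed frames, a 1 kHz sampling rate, a 3-second 50 Hz test tone, and the hypothetical helper stft_frames. A production STFT would normally use overlapping frames.

```python
import numpy as np

def stft_frames(x, fs, window_ms):
    """Minimal STFT: split x into non-overlapping windows and FFT each one."""
    n = int(fs * window_ms / 1000)            # samples per window
    n_frames = len(x) // n                    # number of time slices
    frames = x[: n_frames * n].reshape(n_frames, n) * np.hanning(n)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(n, d=1 / fs)      # frequency of each bin in Hz
    return spectrum, freqs

fs = 1000                                     # sampling rate in Hz (assumed)
t = np.arange(3 * fs) / fs                    # three seconds of samples
x = np.sin(2 * np.pi * 50 * t)                # illustrative test tone

resolution = {}
for window_ms in (25, 125, 375, 1000):
    spectrum, freqs = stft_frames(x, fs, window_ms)
    # (number of time slices, spacing between frequency bins in Hz)
    resolution[window_ms] = (spectrum.shape[0], freqs[1] - freqs[0])
    print(window_ms, "ms ->", resolution[window_ms])
```

Short windows give many time slices but coarse frequency bins, while long windows give fine frequency bins but only a handful of time slices.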

This is where wavelets come in. Wavelets are a better way of analyzing these dynamic signals because they adapt their resolution across time and frequency instead of using a single fixed window. The Wavelet Transform tells us which frequencies are present as well as when they were observed. It does this by working at different scales: first we look at the signal with larger windows to understand the larger features, and then we use smaller windows to identify the smaller features. For low frequencies, the wavelet transform has high resolution in the frequency domain but low resolution in the time domain. For high frequencies, it has high resolution in the time domain but low resolution in the frequency domain. It is very important to understand this trade-off, because it is the key reason wavelets are preferred over the short-time Fourier transform.
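As a rough sketch of what this looks like in practice, the continuous wavelet transform in PyWavelets (pywt.cwt) can be applied to a signal whose frequency changes halfway through. The signal, the Morlet wavelet ("morl"), the scale range and the sample ranges used for averaging are all assumptions made for this example.

```python
import numpy as np
import pywt

fs = 1000                     # sampling rate in Hz (assumed)
t = np.arange(fs) / fs        # one second of samples

# 10 Hz during the first half, 50 Hz during the second half.
signal = np.where(t < 0.5, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 50 * t))

scales = np.arange(1, 129)    # small scale = high frequency, large scale = low
coefs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1 / fs)
power = np.abs(coefs) ** 2    # shape: (number of scales, number of samples)

# Average the power over the interior of each half, away from the edges.
first_half = power[:, 150:350].mean(axis=1)
second_half = power[:, 700:950].mean(axis=1)

dominant_first = freqs[np.argmax(first_half)]
dominant_second = freqs[np.argmax(second_half)]
print(f"first half:  ~{dominant_first:.1f} Hz")
print(f"second half: ~{dominant_second:.1f} Hz")
```

Unlike the plain Fourier transform, the coefficient matrix keeps a time axis, so the dominant frequency can be read off separately for each part of the signal.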

So what are these wavelets? What do they look like? Wavelets are ‘mini waves’ that have a short ‘burst’ and die away quickly, unlike a sin() or cos() wave, which goes on forever.

The fact that wavelets are localized in time gives us an advantage over infinite waves: better resolution in the time domain. Instead of modeling our signal with an infinite wave, as in the Fourier transform, we model it with a finite wave that is slid across the time domain. This sliding process is called convolution. After we have done this with the mother wavelet, we can scale it to accommodate lower and higher frequencies, which essentially means stretching or squishing the wave. There are hundreds of different wavelets, which is why there are hundreds of different transforms suited to different domains. However, every wavelet transform is expressed in terms of scale, which can easily be converted to frequency using the PyWavelets function scale2frequency.
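A short sketch of the scale-to-frequency conversion, assuming the Morlet wavelet ("morl") and a 1 kHz sampling rate, neither of which comes from the original post. pywt.scale2frequency returns a normalized frequency (cycles per sample), so it has to be multiplied by the sampling rate to get Hz.

```python
import numpy as np
import pywt

sampling_rate_hz = 1000.0                 # assumed for this sketch
scales = np.array([1, 2, 4, 8, 16, 32])

# Normalized frequency (cycles per sample) for each scale of the Morlet wavelet.
normalized = np.array([pywt.scale2frequency("morl", float(s)) for s in scales])

# Convert to Hz by multiplying with the sampling rate.
freqs_hz = normalized * sampling_rate_hz

for s, f in zip(scales, freqs_hz):
    print(f"scale {s:>2} -> {f:.2f} Hz")
```

Doubling the scale (stretching the wavelet) halves the frequency, which is the stretching and squishing described above.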

Below are the different families of wavelets, each with a different shape, smoothness and compactness.
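If PyWavelets is installed, the families it ships with can be listed directly. This is just a quick way to explore what is available; the exact lists depend on the installed version.

```python
import pywt

# Short and long names of the built-in wavelet families.
families = pywt.families()
family_names = pywt.families(short=False)
for short, long_name in zip(families, family_names):
    print(f"{short:>5}  {long_name}")

# Wavelets usable with the continuous transform.
continuous = pywt.wavelist(kind="continuous")

# Members of a single family, here Daubechies ("db").
db_members = pywt.wavelist("db")
print(db_members[:5])
```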

We can design a new wavelet provided it satisfies two conditions: it has finite energy in time and frequency, and it has zero mean. It should also be noted that there are two types of wavelets, discrete and continuous, depending on the scale and translation factors.
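The two conditions can be checked numerically for a candidate wavelet. The sketch below uses the (unnormalized) Mexican hat shape, (1 - t^2) * exp(-t^2 / 2), as an illustrative choice and approximates the integrals with simple sums. A sine wave is included for contrast; sine_energy is a hypothetical helper.

```python
import numpy as np

t = np.linspace(-8, 8, 16001)
dt = t[1] - t[0]

# Unnormalized Mexican hat wavelet (illustrative candidate).
mexican_hat = (1 - t**2) * np.exp(-t**2 / 2)

mean_value = np.sum(mexican_hat) * dt     # approximates the integral of psi(t)
energy = np.sum(mexican_hat**2) * dt      # approximates the integral of psi(t)^2

print(f"mean   ~ {mean_value:.2e}")
print(f"energy ~ {energy:.4f}")

def sine_energy(duration):
    """Energy of a 1 Hz sine over `duration` seconds; it grows without bound."""
    ts = np.arange(0, duration, dt)
    return np.sum(np.sin(2 * np.pi * ts) ** 2) * dt

print(sine_energy(10), sine_energy(100))
```

The Mexican hat has zero mean and a finite energy, while the energy of the sine keeps growing with the length of the interval, which is why an infinite sine wave does not qualify as a wavelet.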

I hope this served as a healthy introduction to the concept of wavelets. I’ve tried to sum up as many of the fundamentals as I could without diving deep into the mathematics. I encourage the reader to go through other resources to learn more about wavelets. The next part of this series will focus on the visualization and classification of signals using continuous wavelets and Convolutional Neural Networks.


Microsoft Research | Intel Early Innovator Grant Recipient | BITS-Pilani