What Is Phase Analytics?

Sean McPherson
Published in Phase Analytics
Sep 13, 2017

Phase Analytics considers data science from the perspective of digital signal processing. Digital signal processing, or DSP, is a field of electrical engineering that focuses on mathematical operations and transforms, i.e., the “processing”, of digital signals. The digital signals most relevant to data science include audio and speech, as well as images and video. However, DSP applies just as well to any time series. The mathematical operations that DSP focuses on include basis transformations, such as the Fourier transform and the discrete cosine transform used in image and video processing; filtering operations, including adaptive filters and the ARMA methods common in time series forecasting; and the encoding and compression techniques used in JPEG images or even zipped files. Still not sure what DSP is used for? Consider that many of the technologies you use every day rely on digital signal processing: listening to MP3s, streaming video (e.g., MPEG-4), even the speech processing used in that phone call you just made to your parents.
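To make the idea of a basis transformation concrete, here is a minimal sketch of the discrete Fourier transform applied to a short time series. It is a naive O(N²) version written with only the Python standard library, not an optimized FFT, and the function names `dft` and `dominant_frequency`, as well as the 5 Hz test tone and 64 Hz sample rate, are assumptions chosen purely for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a sequence of samples."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dominant_frequency(signal, sample_rate):
    """Frequency in Hz of the strongest bin below Nyquist, ignoring the DC bin."""
    n = len(signal)
    spectrum = dft(signal)
    peak_bin = max(range(1, n // 2), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / n


# Hypothetical test signal: a 5 Hz sine sampled at 64 Hz for one second.
sample_rate = 64
tone = [math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]

print(dominant_frequency(tone, sample_rate))  # prints 5.0
```

The transform re-expresses the same 64 samples in a basis of sinusoids, so the periodic structure that is spread across the whole time series shows up as a single large coefficient.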

Why is digital signal processing important for data science? Essentially, data science involves three components:

  • Mathematics — statistics and probability theory
  • Algorithms — optimization methods and machine learning
  • Data — samples from probability distributions

One of the wonderful things about data science and data scientists is that these components draw on varied backgrounds, and each component is learned in a different course of study.

The mathematics is learned, naturally, from general courses in maths, e.g., calculus, probability, and statistics, but also from specific fields such as physics and finance. Algorithms, in particular those used in data science, are generally the realm of computer scientists; however, many of the more common methods, e.g., linear regression, are found in other fields.

Finally, data, as in collecting and analyzing data sets, is the area of data science where it is crucial to understand the question being explored, and how to gather data to answer this question. Because of this need to understand the subject matter, the realm of data has benefitted from those with backgrounds in social and political science, psychology, and economics.

These skills remain very important to data gathering. However, largely due to the emergence of deep learning, an understanding of digital signal processing is becoming crucial for data scientists as well.

Why is digital signal processing important for deep learning? Traditional machine learning algorithms operate on vectors of features. These feature vectors are crafted by feature engineers, the social scientists who understand what features are useful to answer the question at hand. Deep learning networks in part remove the feature engineering step, as they learn relevant features through training. The inputs to deep learning models are no longer crafted feature vectors but raw input data in the form of digital signals: vectors of audio waveforms, tensors of images and videos.
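The contrast between the two kinds of input can be sketched in a few lines. The example below is only an illustration under stated assumptions: the 440 Hz tone, the 8000 Hz sample rate, and the three features (mean, energy, zero-crossing rate) are hypothetical choices, not a recommended feature set.

```python
import math


def handcrafted_features(waveform):
    """Summarize a waveform as [mean, energy, zero-crossing rate]."""
    n = len(waveform)
    mean = sum(waveform) / n
    energy = sum(x * x for x in waveform) / n
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(waveform, waveform[1:]) if (a < 0) != (b < 0)
    )
    return [mean, energy, crossings / (n - 1)]


# Hypothetical audio clip: one second of a 440 Hz tone sampled at 8000 Hz.
sample_rate = 8000
waveform = [
    math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)
]

features = handcrafted_features(waveform)  # traditional model input: 3 numbers
raw_input = waveform                       # deep model input: every sample

print(len(features), len(raw_input))  # prints 3 8000
```

A traditional model sees only the three summary numbers, while a deep learning model receives all 8000 samples and has to learn for itself which structure in the signal matters.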

Thus, in the new era of deep learning, the input data take the form of digital signals, which are the realm of digital signal processing. Achieving optimal performance with deep learning models will require a solid understanding of digital signal processing concepts and techniques.

What is the goal of Phase Analytics? The purpose of Phase Analytics is to i) provide data scientists with an understanding of some essential concepts in digital signal processing, and ii) explore how advanced techniques from digital signal processing can benefit data science, and in particular deep learning. These areas will be explored in the coming months through blog posts and code submissions to our GitHub repository.
