Published in

DRILL

BITCOIN case study: applying basic Digital Signal Processing to financial data

A brief analysis of the knowledge that we can extract from historical financial data using simple DSP concepts.

Financial analysis and forecasting are among the most captivating challenges for data scientists and engineers. Even though it may be impossible to predict prices with complete accuracy, certain success rates can be obtained in trend forecasting and other indicators. Using this information, long-term strategies can work, and many diversified intraday portfolios are nowadays also based on big data and machine learning algorithms. Sebastian Heinz and Alexandr Honchar are good examples of these practices on Medium. [1][2]

In this article, I will apply some simple DSP techniques to financial data in order to extract valuable information from the historical values. This information can be used afterward to feed deep learning systems, but that is something we will discuss in future articles. This article is just an introduction to the capabilities of DSP in this market.

The basis of signal processing is that any time-varying signal can be decomposed into a (possibly huge) number of sinusoids with different frequencies. This useful fact allows us to analyze the amplitude variations in terms of their periodicity, extracting or isolating the components of interest.

For this example, I will be working with Bitcoin (BTC) close values dated up to March 2018. The dataset is available for download from Kaggle. [3]

The following graph shows a simple plot of close values vs. time, sampled with a 1-minute period. I will analyze just the latest 230 days.

BTC spectrum analysis

The first thing to check is the frequency spectrum of this signal. If there is any clear repetition of price variations, the frequency component related to this period should be stronger.

We can simply apply the Discrete Fourier Transform with an FFT algorithm [4], and plot the magnitudes obtained. However, there is not much to see: the result doesn’t have any clear ‘tone’ and presents a strong 1/f component. [5]
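
As a minimal sketch of this step, the snippet below computes a one-sided magnitude spectrum with NumPy. It does not use the Kaggle data: the signal is a synthetic stand-in (a weekly cycle plus noise, sampled hourly), and the function name `magnitude_spectrum` and its parameters are illustrative assumptions, not the code behind the original plots.

```python
import numpy as np

def magnitude_spectrum(x, sample_period_days):
    """Return (frequencies in cycles/day, one-sided FFT magnitudes) of x."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=sample_period_days)
    return freqs, np.abs(spectrum) / x.size

# Synthetic stand-in for a price series (assumption): weekly cycle plus noise.
rng = np.random.default_rng(0)
t_days = np.arange(70 * 24) / 24            # 70 days, one sample per hour
signal = 5.0 * np.sin(2 * np.pi * t_days / 7) + rng.normal(0, 1, t_days.size)

freqs, mags = magnitude_spectrum(signal, sample_period_days=1 / 24)
peak = freqs[1:][np.argmax(mags[1:])]       # skip the DC bin
print(f"strongest component: {peak:.4f} cycles/day")
```

A signal with a real periodicity shows a clear peak at that frequency (here, one cycle per week). The BTC spectrum has no such peak, which is the point made above.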

In fact, we can consider this kind of financial data to be very similar to Brown noise [6], whose power density is inversely proportional to its frequency squared (so its magnitude spectrum falls off as 1/f, matching the component seen above). We can check a time graph for Brown noise and its frequency spectrum. The resemblance to any price chart is clear, apart from the DC component.
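
The Brown noise comparison can be checked numerically. The sketch below assumes Brown noise is generated as the running sum of white Gaussian noise (a random walk) and estimates the slope of its power spectrum on a log-log scale. The helper `psd_slope` and the 0.05 cycles/sample band limit are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2 ** 16
white = rng.normal(0.0, 1.0, n)
brown = np.cumsum(white)        # integrating white noise gives a random walk

def psd_slope(x):
    """Least-squares slope of log10(power) vs log10(frequency), low band only."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    freqs = np.fft.rfftfreq(x.size, d=1.0)
    band = (freqs > 0) & (freqs < 0.05)   # avoid DC and the region near Nyquist
    slope, _ = np.polyfit(np.log10(freqs[band]), np.log10(power[band]), 1)
    return slope

brown_slope = psd_slope(brown)
white_slope = psd_slope(white)
print(f"brown noise slope: {brown_slope:.2f}, white noise slope: {white_slope:.2f}")
```

A slope close to -2 corresponds to power proportional to 1/f², while white noise stays close to 0.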

So, once we realize that the BTC spectrum does not provide much information by itself, we can process the frequency data to see the variations in the time domain. I will start with the most common and necessary processing for radio-frequency and audio signals: filters. [7]

If we apply a high-pass filter, using 1 cycle/year as the cut-off frequency, we can eliminate the slow variations of the price from the graph. This means that only the price variations faster than 1 cycle per year will be displayed. We can identify the periods of lower volatility as those where the filtered price stays closer to 0.

We can even use a higher cut-off frequency. In this case, the information can be used for fast operations or short-term investing.
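
One simple way to sketch a high-pass filter is to zero the FFT bins below the cut-off and transform back. This brick-wall mask is an assumption made for illustration; the article does not state which filter design produced its figures, and smoother designs (for example Butterworth) ring less. The function name, the synthetic signal and the 0.1 cycles/day cut-off are all hypothetical.

```python
import numpy as np

def fft_highpass(x, sample_period_days, cutoff_cycles_per_day):
    """Brick-wall high-pass: zero every FFT bin below the cut-off frequency."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=sample_period_days)
    spectrum[freqs < cutoff_cycles_per_day] = 0.0   # this also removes the DC bin
    return np.fft.irfft(spectrum, n=x.size)

# Synthetic stand-in (assumption): offset + slow 100-day cycle + fast daily cycle.
t_days = np.arange(200 * 24) / 24                   # 200 days, hourly samples
slow = 1000.0 + 200.0 * np.sin(2 * np.pi * t_days / 100)
fast = 10.0 * np.sin(2 * np.pi * t_days)
filtered = fft_highpass(slow + fast, 1 / 24, cutoff_cycles_per_day=0.1)

print("max deviation from the fast component:", np.max(np.abs(filtered - fast)))
```

With a cut-off expressed in cycles per year, the same function would take `1 / 365` cycles/day. Raising the cut-off keeps only progressively faster variations.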

Another option is plotting a spectrogram, a frequency-time graph in which the Z axis is the amplitude of the spectrum, represented with a color code. Again, we can identify the fluctuations in volatility by checking the periods of time in which the spectral power density is far from zero. In the following figure, the volatility is higher around days 200–220 of our dataset, corresponding to the end of 2017.
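
A spectrogram can be sketched as a short-time Fourier transform: cut the signal into overlapping frames, window each frame, and take its FFT. The frame length, hop size and Hann window below are illustrative assumptions, and the input is synthetic noise whose amplitude grows in the second half, standing in for a more volatile period.

```python
import numpy as np

def spectrogram(x, frame_len, hop):
    """Return (power with shape [n_freqs, n_frames], frame start indices)."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(frame_len)
    starts = np.arange(0, x.size - frame_len + 1, hop)
    frames = np.stack([x[s:s + frame_len] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return power.T, starts

# Synthetic stand-in (assumption): noise that is 5x larger in the second half.
rng = np.random.default_rng(7)
x = rng.normal(0.0, 1.0, 4096)
x[2048:] *= 5.0

power, starts = spectrogram(x, frame_len=256, hop=128)
energy = power[1:].sum(axis=0)      # per-frame energy, excluding the DC bin
print("quiet frames max:", energy[:15].max(), "loud frames min:", energy[16:].min())
```

Plotting `power` as an image (time on one axis, frequency on the other) gives the color-coded view described above; the per-frame energy is a compact proxy for volatility.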

We can also try the inverse strategy: instead of isolating the high-speed variations from the current price, we can use a low-pass filter to eliminate them. The result is a smoothed curve in which we can easily identify the trend of the local values. The lower the cut-off frequency, the fewer short-term variations remain.
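
The effect of the cut-off frequency on smoothing can be sketched with a brick-wall FFT mask. This is again an illustrative assumption rather than the filter used for the figures, and the trend, noise level and cut-off values are hypothetical.

```python
import numpy as np

def fft_lowpass(x, sample_period_days, cutoff_cycles_per_day):
    """Brick-wall low-pass: zero every FFT bin above the cut-off frequency."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=sample_period_days)
    spectrum[freqs > cutoff_cycles_per_day] = 0.0   # DC and slow bins are kept
    return np.fft.irfft(spectrum, n=x.size)

# Synthetic stand-in (assumption): slow trend buried in fast noise.
rng = np.random.default_rng(1)
t_days = np.arange(200 * 24) / 24
trend = 1000.0 + 200.0 * np.sin(2 * np.pi * t_days / 100)
noisy = trend + rng.normal(0.0, 20.0, t_days.size)

rms_by_cutoff = {}
for cutoff in (1.0, 0.05):                          # cycles/day
    smooth = fft_lowpass(noisy, 1 / 24, cutoff)
    rms_by_cutoff[cutoff] = np.sqrt(np.mean((smooth - trend) ** 2))
print(rms_by_cutoff)
```

The lower cut-off leaves a curve closer to the underlying trend, at the cost of discarding any genuine fast movement.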

Of course, there are other techniques to smooth the signal in the time domain, as we will see in the following section.

BTC time samples analysis

The moving mean is one of the most common indicators for historical prices [8]. It consists of calculating the average value of a limited number of time samples around one point and ‘moving’ this time window along the whole dataset. Crossing points between curves with different window sizes indicate trend changes. However, it is clearly a lagging indicator that provides late information by itself.
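
The crossing idea and its delay can be sketched as follows. The snippet assumes a trailing window (each average uses only past samples), which is a common convention but not necessarily the one used for the figures. The names `moving_mean` and `crossings` and the window sizes are illustrative.

```python
import numpy as np

def moving_mean(x, window):
    """Trailing moving average; output[i] averages x[i : i + window]."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

def crossings(x, fast_window, slow_window):
    """Indices into x where the fast average crosses the slow one."""
    # Drop the first fast values so both series end on the same sample of x.
    fast = moving_mean(x, fast_window)[slow_window - fast_window:]
    slow = moving_mean(x, slow_window)
    sign = np.sign(fast - slow)
    change = np.nonzero(sign[1:] * sign[:-1] < 0)[0] + 1
    return change + slow_window - 1     # shift back to positions in x

# Synthetic stand-in (assumption): price rises to a peak at index 500, then falls.
price = np.concatenate([np.arange(500), np.arange(500, 0, -1)]).astype(float)
cross = crossings(price, fast_window=10, slow_window=50)
print("peak at index", int(np.argmax(price)), "crossing at index", cross)
```

The single crossing is reported some samples after the peak, which illustrates why the indicator gives late information.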

The same process can be applied by calculating the moving median.
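
A short sketch of why the median can be preferable: it ignores isolated spikes that drag the mean. It assumes NumPy 1.20 or later for `sliding_window_view`; the flat price with a single outlier is synthetic.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def moving_median(x, window):
    """Median over each run of `window` consecutive samples."""
    views = sliding_window_view(np.asarray(x, dtype=float), window)
    return np.median(views, axis=1)

def moving_mean(x, window):
    """Mean over each run of `window` consecutive samples."""
    views = sliding_window_view(np.asarray(x, dtype=float), window)
    return views.mean(axis=1)

# Synthetic stand-in (assumption): flat price with one erroneous spike.
price = np.full(101, 100.0)
price[50] = 1000.0

med = moving_median(price, 5)
avg = moving_mean(price, 5)
print("median range:", med.min(), med.max(), "| mean max:", avg.max())
```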

In addition, regression techniques [9] can display the trend of certain intervals by calculating the dependency between the X and Y axes, that is, the time and price of the period of interest. Regression is a typical tool for time-series forecasting, and linear regression is an easy way to obtain the approximate slope of data distributions.

However, for data with high volatility, as in the BTC case, it may be more appropriate to use quadratic regression to fit the curve better. In the following figure, we can compare the curves for both methods using the same window size. Note that regression calculation is usually much slower than using the mean or median approaches.
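
The comparison between linear and quadratic regression over one window can be sketched with `numpy.polyfit`. The curved synthetic data, the rescaling of time to [-1, 1] and the helper name `window_fit` are assumptions made for illustration.

```python
import numpy as np

def window_fit(t, price, degree):
    """Fit a polynomial of the given degree; return (coeffs, fitted, rms error)."""
    coeffs = np.polyfit(t, price, degree)        # highest power first
    fitted = np.polyval(coeffs, t)
    rms = np.sqrt(np.mean((price - fitted) ** 2))
    return coeffs, fitted, rms

# Synthetic stand-in (assumption): one window of curved, noisy prices.
rng = np.random.default_rng(3)
t = np.linspace(-1.0, 1.0, 201)                  # window time rescaled to [-1, 1]
price = 100.0 + 30.0 * t - 40.0 * t ** 2 + rng.normal(0.0, 2.0, t.size)

lin_coeffs, _, lin_rms = window_fit(t, price, 1)
quad_coeffs, _, quad_rms = window_fit(t, price, 2)
print(f"linear rms: {lin_rms:.2f}, quadratic rms: {quad_rms:.2f}")
```

The linear fit still recovers the average slope, while the quadratic fit also captures the curvature and leaves a smaller residual. Repeating the fit for every window position is what makes this approach slower than a moving mean.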

There are many other types of regression and moving-window calculations, for example weighted and cumulative averages, logarithmic and exponential regression, the Gaussian filter, or the Savitzky-Golay regression filter. Depending on the purpose and the type of data, each of them will provide different results.
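
Two of the alternatives mentioned, the weighted and the cumulative average, are small enough to sketch directly. The linear weights (most recent sample weighted highest) and the function names are illustrative assumptions.

```python
import numpy as np

def weighted_moving_average(x, weights):
    """Trailing weighted average; weights are ordered oldest to newest."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                              # normalize so weights sum to 1
    return np.correlate(np.asarray(x, dtype=float), w, mode="valid")

def cumulative_average(x):
    """Average of all samples seen so far, at every position."""
    x = np.asarray(x, dtype=float)
    return np.cumsum(x) / np.arange(1, x.size + 1)

prices = np.array([1.0, 2.0, 3.0, 4.0])
wma = weighted_moving_average(prices, [1, 2, 3])
cma = cumulative_average(prices)
print("weighted:", wma, "cumulative:", cma)
```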

Conclusions

Simple data filtering, smoothing and visualization techniques can give us a broader perspective of the behavior of the price over time. Processed data can indeed be used to feed deep learning algorithms, making it easier for the system to learn from the relevant information.

For example, for long-term investing it may be useful to smooth the dataset to focus on the trend. However, removing the DC component and feeding the algorithms with high-frequency information may be better for short-term understanding.

Of course, there is much more to it than these simple DSP techniques. Financial data pre-processing and analysis is a complex field, and the profitability of the results will depend on the investing strategy.

References

[1] S. Heinz, “A simple deep learning model for stock price prediction using TensorFlow”, Medium, 2017. [Online]. Available: https://medium.com/mlreview/a-simple-deep-learning-model-for-stock-price-prediction-using-tensorflow-30505541d877. [Accessed: 10- Jun- 2018].

[2] A. Honchar, “Neural networks for algorithmic trading. Simple time series forecasting”, Medium, 2016. [Online]. Available: https://medium.com/machine-learning-world/neural-networks-for-algorithmic-trading-part-one-simple-time-series-forecasting-f992daa1045a. [Accessed: 10- Jun- 2018].

[3] Zielak, “Bitcoin Historical Data | Kaggle”, Kaggle.com, 2018. [Online]. Available: https://www.kaggle.com/mczielinski/bitcoin-historical-data. [Accessed: 10- Jun- 2018].

[4] E. Weisstein, “Fast Fourier Transform — from Wolfram MathWorld”, Mathworld.wolfram.com. [Online]. Available: http://mathworld.wolfram.com/FastFourierTransform.html. [Accessed: 10- Jun- 2018].

[5] R. Kiely, “Understanding and Eliminating 1/f Noise | Analog Devices”, Analog.com, 2017. [Online]. Available: http://www.analog.com/en/analog-dialogue/articles/understanding-and-eliminating-1-f-noise.html. [Accessed: 10- Jun- 2018].

[6] J. Castro, “What Is Brown Noise?”, Live Science, 2013. [Online]. Available: https://www.livescience.com/38547-what-is-brown-noise.html. [Accessed: 10- Jun- 2018].

[7] I. Poole, “RF Filter Basics Tutorial :: Radio-Electronics.Com”, Radio-electronics.com. [Online]. Available: http://www.radio-electronics.com/info/rf-technology-design/rf-filters/rf-filter-basics-tutorial.php. [Accessed: 10- Jun- 2018].

[8] E. Picardo, “Moving Average — MA”, Investopedia. [Online]. Available: https://www.investopedia.com/terms/m/movingaverage.asp. [Accessed: 10- Jun- 2018].

[9] “Regression”, Investopedia. [Online]. Available: https://www.investopedia.com/terms/r/regression.asp. [Accessed: 10- Jun- 2018].

Telmo Subira Rodriguez

Studying AI for fun. Electronics & Telecommunication Systems engineer. Science-fiction lover. Passionate about technology, good design, and innovation.