New Insights into Stock Price Movements Using Cyclicity Analysis

Vivek Kaushik
Dec 24, 2021


Suppose we have a collection of time-series. We pose the following general questions.

  1. For any pair of time-series, can we determine a leader-follower relationship between their individual temporal patterns?
  2. Can we determine the order in which the time-series undergo their individual temporal patterns?

Sure, we could just plot all the time-series and inspect them manually to find the answers. But as in most real-world cases, the data could be noisy.

Stock prices are notoriously noisy. If you don’t believe that, just observe the daily stock prices of the tech giants and major investment banks over the past decade!

Daily Historical Stock Prices

I’m sure seeing all that hurt your eyes! What insights could we possibly extract from these stock prices when all they seemingly do is fluctuate rapidly without any context?

We will use a procedure from data pattern recognition known as Cyclicity Analysis to extract new, surprising leader-follower insights from the noisy stock price data. Given a large collection of noisy time-series, Cyclicity Analysis tests how well the collection fits what is called the chain of offsets model. The chain of offsets model asserts that there is an underlying continuous process that represents our collection of time-series up to scaling and small time delays (known as offsets). Equivalently, the model asserts that the time-series are scaled and shifted versions of each other. The procedure involves determining pairwise leader-follower relationship strengths and the order in which the time-series undergo their temporal patterns.
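To make the chain of offsets model concrete, here is a minimal sketch that generates scaled and shifted copies of one underlying process. The function name `chain_of_offsets`, the sine-wave base process, and the particular scales and offsets are all hypothetical choices made for illustration.

```python
import numpy as np

def chain_of_offsets(base, scales, offsets, num_points=500):
    """Sample x_k(t) = scales[k] * base(t - offsets[k]) on the interval [0, 1)."""
    t = np.linspace(0.0, 1.0, num_points, endpoint=False)
    return np.stack([c * base(t - a) for c, a in zip(scales, offsets)])

# Hypothetical underlying process: one period of a sine wave.
def base(t):
    return np.sin(2 * np.pi * t)

# Three time-series that differ only by a scale and a small delay.
series = chain_of_offsets(base, scales=[1.0, 0.5, 2.0], offsets=[0.0, 0.1, 0.25])
```

A larger offset means the series undergoes the same pattern later, so under this model the first row leads the second, which leads the third.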

Related time-series analysis techniques include Dynamic Time Warping (DTW) and lagged correlation. While these techniques determine pairwise leader-follower relationships algorithmically, they are not computationally efficient. They also do not determine the overall order in which the time-series undergo their temporal patterns.

Step 1: Determining Pairwise Leader Follower Relationship Strengths

The first step of Cyclicity Analysis is to determine the leader-follower relationship strength of any two time-series.

Let’s understand how to do this in a simple case. Consider the two time-series plotted below.

Artificial Business Cycles

They both exhibit the same temporal pattern; they rise for a period of time and subsequently fall. For this reason, in macroeconomic terms, you may construe this temporal pattern as a business cycle. The key observation to make is that the orange time-series precedes the blue time-series relative to when they undergo their individual business cycles. The first step of Cyclicity Analysis is to recover this key observation mathematically.

Here is how we will mathematically determine the leader-follower relationship strength of any two time-series in general. Let {x_t} and {y_t} be two given time-series, both having the same number of observations. At each time t, we aggregate the corresponding 2 time-series values into a coordinate pair (x_t , y_t), where x_t is the first time-series value and y_t is the second time-series value at the time t.

Consider the polygon whose vertices are all such coordinate pairs, taken in time order. We can compute the area of this polygon, which requires a surprising application of Green’s Theorem from Multivariable Calculus. For each time t, we parameterize the boundary line segment L_t of the polygon so that it starts at the vertex (x_t , y_t) and ends at the vertex (x_{t+1} , y_{t+1}). Using this parameterization, we compute the corresponding line integral

Line Integral

By Green’s Theorem, summing these line integrals over all times t gives the area of the polygon. See the details in this Jupyter Notebook on the complete derivation of an explicit formula for the oriented area.
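For readers who want the computation spelled out, here is one standard way to write it. This sketch assumes the usual Green’s Theorem area integrand (1/2)(x dy - y dx) and a closing edge from the last vertex back to the first; the displayed formula in the original figure may be written in a different but equivalent form.

```latex
% Assumption: area integrand (1/2)(x\,dy - y\,dx), polygon closed by an edge
% from the last vertex back to the first.
\[
\mathbf{r}_t(s) = \bigl((1-s)\,x_t + s\,x_{t+1},\; (1-s)\,y_t + s\,y_{t+1}\bigr),
\qquad 0 \le s \le 1,
\]
\[
\frac{1}{2}\int_{L_t} x\,dy - y\,dx
  = \frac{1}{2}\bigl(x_t\,y_{t+1} - x_{t+1}\,y_t\bigr),
\]
\[
\text{oriented area} = \frac{1}{2}\sum_{t}\bigl(x_t\,y_{t+1} - x_{t+1}\,y_t\bigr).
\]
```

The middle identity follows because the terms involving s cancel when the parameterization is substituted into the integrand, leaving a constant. The final sum is the familiar shoelace formula.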

Let’s go back to our artificial example. Let {x_t} be the blue time-series and {y_t} be the orange time-series. The corresponding polygon resulting from aggregating the data into coordinate pairs resembles a tilted ellipse.

A Tilted Ellipse

By Green’s Theorem, the area of this polygon turns out to be about -0.61289. Hmmm … but why did we get a negative number here? Shouldn’t an area be a nonnegative quantity? Well, the sign of the area depends on how we traverse the polygon. If we traverse the polygon in the counterclockwise direction, we get a positive area. If we traverse it in the clockwise direction, we get a negative area (and a degenerate polygon that encloses no region gives zero). For this reason, we call the area resulting from the parameterization procedure above the oriented (signed) area.

Here is the key takeaway of the previous paragraph. The oriented area reveals the nature of the leader-follower relationship of the two time-series {x_t} and {y_t} in general. If the oriented area is positive, then we say {x_t} precedes {y_t}. If the oriented area is zero, then we say {x_t} is in phase (in sync) with {y_t}. If the oriented area is negative, then we say {x_t} follows {y_t}. Furthermore, the larger the oriented area is in absolute value, the stronger the leader-follower relationship.

In our case with the 2 artificial business cycle time-series, we initially defined {x_t} to be the blue time-series and {y_t} to be the orange time-series. The oriented area we obtained was a negative number, so this means the blue time-series follows the orange time-series. This is indeed the case!

One more remark. Had we instead defined {x_t} to be the orange time-series and {y_t} to be the blue time-series, the oriented area would flip sign, i.e., it would now be +0.61289. In this case, this means the orange time-series precedes the blue time-series, which is again true!
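Step 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the notebook’s implementation: the function name `oriented_area`, the shoelace form of the sum, and the two synthetic sine waves are assumptions made for the example.

```python
import numpy as np

def oriented_area(x, y):
    """Signed (shoelace) area of the closed polygon with vertices (x_t, y_t)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # np.roll wraps the last vertex back to the first, which closes the polygon.
    x_next = np.roll(x, -1)
    y_next = np.roll(y, -1)
    return 0.5 * np.sum(x * y_next - x_next * y)

# Synthetic example: the follower is the leader delayed by 0.5 radians.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
leader = np.sin(t)
follower = np.sin(t - 0.5)

area_lf = oriented_area(leader, follower)   # positive: first argument precedes
area_fl = oriented_area(follower, leader)   # same magnitude, opposite sign
```

Swapping the two arguments flips the sign, which mirrors the remark above about exchanging the roles of the blue and orange time-series.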

Step 2: Determining the Cyclic Order of the Time-Series

The second step of Cyclicity Analysis is to recover the cyclic order of a given collection of N time-series, which is the exact order in which the time-series undergo their own individual temporal patterns.

As before, let’s understand how to do this in the case of artificial time-series undergoing business cycles. Consider the 5 time-series plotted below.

Artificial Business Cycle Time-Series

This is the key observation to make. The order in which the time-series undergo their respective business cycles is: purple, red, green, orange, and finally blue. This is what we mean by the cyclic order for these time-series. The second step of Cyclicity Analysis is to recover this order mathematically.

Here is how we will determine the cyclic order of a given collection of N different time-series. First, we enumerate these time-series in a specific order from 1 to N. We then construct the N x N lead-lag matrix

Lead-Lag Matrix

where the entry A_{j,k} is the corresponding oriented area for the j-th and k-th time-series (as described in Step 1).

We make a remark. The lead-lag matrix is always skew-symmetric, which means that A^T = -A, where A^T denotes the matrix transpose of A. This is intuitively clear. Any time-series is always in phase with itself, so each principal diagonal entry of A, namely A_{j,j}, is zero. Also, one time-series preceding another is equivalent to the latter following the former. This means A_{j,k} = -A_{k,j}.
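Here is a minimal sketch of how such a matrix could be assembled for all pairs at once. The function name `lead_lag_matrix`, the closed-polygon (wrap-around) convention, and the three synthetic sine waves are assumptions for illustration.

```python
import numpy as np

def lead_lag_matrix(series):
    """Return the N x N matrix of pairwise oriented areas.

    `series` has shape (N, T): one row per time-series.
    Entry [j, k] is 0.5 * sum_t (x_j[t] * x_k[t+1] - x_j[t+1] * x_k[t]).
    """
    X = np.asarray(series, dtype=float)
    X_next = np.roll(X, -1, axis=1)  # each row shifted by one step, wrapping around
    return 0.5 * (X @ X_next.T - X_next @ X.T)

# Synthetic example: three sine waves with increasing delays.
t = np.linspace(0.0, 2.0 * np.pi, 300, endpoint=False)
series = np.stack([np.sin(t - delay) for delay in (0.0, 0.4, 0.8)])
A = lead_lag_matrix(series)
```

Writing the sum as two matrix products avoids an explicit loop over pairs, and the skew-symmetry is visible directly in the formula: transposing swaps the two products.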

Now, we perform spectral analysis on the lead-lag matrix A. We seek the largest eigenvalue (in modulus) of our lead-lag matrix A and its corresponding eigenvector, which we call the dominant eigenvector. Because A is a real skew-symmetric matrix, its nonzero eigenvalues are purely imaginary and occur in complex-conjugate pairs, so the dominant eigenvector generally has complex components.

There is a marvelous theorem, based on low-rank approximation arguments, stating that the cyclic order is precisely the order of indices corresponding to the components of the dominant eigenvector sorted by their principal arguments in ascending order.
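A minimal sketch of this step follows. Two conventions in it are assumptions of the sketch rather than part of the theorem as stated here: out of the conjugate pair of dominant eigenvalues it picks the one with positive imaginary part, and it starts the listing right after the largest angular gap so that the order does not depend on the arbitrary phase of the computed eigenvector. The names `lead_lag_matrix` and `cyclic_order` are hypothetical.

```python
import numpy as np

def lead_lag_matrix(series):
    """N x N matrix of pairwise oriented areas (closed-polygon convention)."""
    X = np.asarray(series, dtype=float)
    X_next = np.roll(X, -1, axis=1)
    return 0.5 * (X @ X_next.T - X_next @ X.T)

def cyclic_order(A):
    """Indices sorted by the principal argument of the dominant eigenvector."""
    eigvals, eigvecs = np.linalg.eig(A)
    # Assumption: the eigenvalues are +/- i*lambda; use the one with the
    # largest positive imaginary part so leaders come first.
    v = eigvecs[:, np.argmax(eigvals.imag)]
    angles = np.angle(v)
    order = np.argsort(angles)
    # Assumption: an eigenvector is only defined up to a complex phase, so we
    # start the listing just after the largest gap between consecutive angles.
    sorted_angles = angles[order]
    gaps = np.diff(np.append(sorted_angles, sorted_angles[0] + 2.0 * np.pi))
    start = (np.argmax(gaps) + 1) % len(order)
    return np.roll(order, -start)

# Synthetic example: index 0 is the latest series, index 4 the earliest.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
delays = [1.2, 0.9, 0.6, 0.3, 0.0]
series = np.stack([np.sin(t - d) for d in delays])
order = cyclic_order(lead_lag_matrix(series))
```

With zero-based indices, the recovered order lists the series from earliest to latest, which is the analogue of the 5, 4, 3, 2, 1 ordering discussed for the five artificial business cycles.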

Let’s go back to our example with the 5 artificial business cycle time-series. Enumerate the time-series in the order of 1: blue, 2: orange, 3: green, 4: red, and 5: purple. We plot the corresponding heatmap representation of the 5 x 5 lead-lag matrix.

We plot the eigenvalue moduli of the lead-lag matrix (sorted in descending order). We also plot each component of the dominant eigenvector as (x,y) coordinate pairs (labeling the i-th component as i).

Business Cycle Lead-Lag Matrix Eigenvalue Moduli and Dominant Eigenvector Component Plots

We can sort the dominant eigenvector components in ascending order via their principal arguments, which will give the cyclic order. In this case, notice that 5,4,3,2,1 is the cyclic order of our 5 time-series. According to our enumeration scheme, this corresponds to purple, red, green, orange, and blue. This is indeed the case.

Another remark. Had we enumerated the 5 business cycles in another way and repeated the same process of constructing the lead-lag matrix and performing the spectral analysis, the resulting cyclic order would still correspond to purple, red, green, orange, and blue. So the cyclic order is independent of the enumeration!

Cyclicity Analysis on Daily Stock Prices

Now comes the fun part. Let’s see what we get when we run Cyclicity Analysis on daily stock prices. For our purposes, let’s consider some of the tech giants and some of the major investment banks.

Pandemic Analysis

Let’s look at daily prices between 2019 and 2020. After some customary preprocessing steps, which include taking logarithms, detrending, and normalizing each time-series, here is what we get:

Stock Prices Just Before/During Pandemic
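The preprocessing steps can be sketched as follows. The exact detrending and normalization used for the plot are not specified here, so this sketch assumes a least-squares linear detrend and scaling to unit Euclidean norm; the function name `preprocess` and the synthetic price series are hypothetical.

```python
import numpy as np

def preprocess(prices):
    """Log-transform, remove a linear trend, and scale to unit norm."""
    log_prices = np.log(np.asarray(prices, dtype=float))
    t = np.arange(log_prices.size)
    # Assumption: "detrending" means subtracting a least-squares straight line.
    slope, intercept = np.polyfit(t, log_prices, deg=1)
    detrended = log_prices - (slope * t + intercept)
    # Assumption: "normalizing" means dividing by the Euclidean norm.
    return detrended / np.linalg.norm(detrended)

# Synthetic prices: exponential growth with a slow oscillation on top.
days = np.arange(500)
synthetic_prices = 100.0 * np.exp(0.001 * days + 0.05 * np.sin(days / 20.0))
z = preprocess(synthetic_prices)
```

Normalizing every series to the same scale keeps a high-priced stock from dominating the oriented areas simply because its numbers are larger.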

Of course, stock prices dropped significantly just after February 2020, which is when the pandemic started. We obtain the lead-lag matrix and the cyclic order of these stock price time-series. After reshuffling the columns of the lead-lag matrix according to this cyclic order, we obtain the following heatmap representation:

Pandemic Lead-Lag Matrix Reshuffled
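Reshuffling a lead-lag matrix by a given order is a one-line indexing operation. In this sketch we permute the rows and the columns together, which is an assumption on our part but keeps the matrix skew-symmetric; the function name `reorder` and the small example matrix are hypothetical.

```python
import numpy as np

def reorder(A, order):
    """Permute rows and columns of A so that index order[0] comes first."""
    order = np.asarray(order)
    return A[np.ix_(order, order)]

# Hypothetical 3 x 3 lead-lag matrix in which index 2 leads and index 0 lags.
A = np.array([[0.0, -1.0, -2.0],
              [1.0,  0.0, -1.0],
              [2.0,  1.0,  0.0]])
B = reorder(A, [2, 1, 0])
```

When the rows and columns follow the cyclic order, the leaders sit at the top and the strongest positive entries collect above the diagonal, which is what makes the reshuffled heatmap easy to read.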

The heatmap shows that Meta’s row (FB) is full of bright red colors, which correspond to the strongest leader-follower relationships. In particular, the data is suggesting many investment banks’ stock price behaviors have been closely following Meta’s own behavior within the past 2 years.

Let’s see some plots of these claimed leader-follower relationships. In each case, we first plot Meta’s stock prices with a chosen investment bank’s stock prices. Then, we plot their accumulated oriented area over time, which shows how the strength of their joint leader-follower relationship develops over time.

Meta (FB) leading over J.P. Morgan (JPM)
Meta (FB) leading over U.S. Bancorp (USB)
Meta (FB) leading over Wells Fargo (WFC)
Meta (FB) leading over Aflac (AFL)

In all cases, we can see Meta’s stock price plummeted during the pandemic, and the investment banks’ stock prices followed suit. Furthermore, the joint leader-follower relationship strength rose sharply in sync with the drop in the investment banks’ stock prices.
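The accumulated oriented area in these plots can be sketched as a running sum of the per-step contributions from Step 1. This sketch assumes the path is left open (no closing edge back to the first point), so the curve at time t depends only on data up to time t; the function name and the synthetic sine waves are hypothetical.

```python
import numpy as np

def accumulated_oriented_area(x, y):
    """Running sum of 0.5 * (x_t * y_{t+1} - x_{t+1} * y_t), starting at 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    steps = 0.5 * (x[:-1] * y[1:] - x[1:] * y[:-1])
    return np.concatenate([[0.0], np.cumsum(steps)])

# Synthetic example: the follower is the leader delayed by 0.5 radians.
t = np.linspace(0.0, 2.0 * np.pi, 201)
leader = np.sin(t)
follower = np.sin(t - 0.5)
acc = accumulated_oriented_area(leader, follower)
```

For these clean signals the curve rises steadily because the first series leads throughout. On noisy prices the same curve is jittery, and a sharp sustained rise marks a stretch where one series clearly leads the other.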

2011 Debt Ceiling Crisis Analysis

Let’s repeat this analysis on a market collapse that occurred way back in mid-2011 that was due to the debt-ceiling crisis. The debt-ceiling crisis occurred as a result of an ongoing debate within Congress regarding the maximum amount of borrowing the federal government should be allowed to undertake. This impacted the entire financial sector, so the investment banks’ stock prices plummeted during the crisis.

Debt Ceiling Stock Prices

Here is the resulting lead-lag matrix.

Debt-Ceiling Lead-Lag Matrix Reshuffled

Notice there is a wall of red in the last column of the matrix. The data is suggesting that Netflix (NFLX) has been following the behavior of many investment banks. That’s a very interesting result, since Netflix and the banks belong to very different sectors of the market!

Let’s actually see several of these leader-follower relationships. As before, in each case, we have plotted the leader (an investment bank) and the follower (Netflix). We have also plotted the accumulated oriented area over time to show the leader-follower relationship strength development throughout the decade.

J.P. Morgan (JPM) over Netflix (NFLX)
Deutsche Bank (DB) over Netflix (NFLX)
Aflac (AFL) over Netflix (NFLX)
Barclays (BCS) over Netflix (NFLX)

In all of these cases, the investment bank stock prices dropped precipitously in mid-2011. But after a couple of months, Netflix’s stock price dropped precipitously as well. This was reflected in the accumulated oriented area plots, which rose sharply just as Netflix’s stock price plummeted.

So What’s Next?

As we mentioned, the accumulated oriented area plots reveal how the leader-follower relationship between two time-series develops in real time. As we saw with stock prices, while the accumulated oriented area plots were jittery due to the data being noisy, we still noticed those sharp, monotonically increasing trends during the market collapses. Instead of manually inspecting these plots as we did, we would like to capture such trends using anomaly detection techniques. Furthermore, we would like to forecast the behaviors of the two time-series by forecasting their accumulated oriented area time-series.

We are leveraging Cyclicity Analysis to determine the real-time leader-follower relationships of both stock prices and cryptocurrency prices during a particular trading day. Regular U.S. stock market trading runs on weekdays from 9:30 A.M. to 4:00 P.M. Eastern Time, while the crypto market is open 24/7. In both cases, we can update the lead-lag matrix and accumulated oriented area plots every minute, which can help investors make quick real-time decisions to buy or sell stocks/cryptos.

Sample Intraday Crypto Prices
Sample Intraday Crypto Lead-Lag Matrix
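A per-minute update does not require recomputing the whole matrix, since each new observation only adds one step to every pairwise sum. Here is a minimal sketch of that idea; the class name `StreamingLeadLag` and the open-path convention are assumptions, and random numbers stand in for a price feed.

```python
import numpy as np

class StreamingLeadLag:
    """Maintain an open-path lead-lag matrix one observation at a time."""

    def __init__(self, n_series):
        self.A = np.zeros((n_series, n_series))
        self.prev = None  # most recent observation vector, if any

    def update(self, values):
        """Add the contribution of the step from the previous values to `values`."""
        x = np.asarray(values, dtype=float)
        if self.prev is not None:
            # Entry [j, k] gains 0.5 * (prev[j] * x[k] - x[j] * prev[k]).
            self.A += 0.5 * (np.outer(self.prev, x) - np.outer(x, self.prev))
        self.prev = x
        return self.A

# Synthetic stand-in for a feed: 3 series, 50 ticks of random data.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))

model = StreamingLeadLag(n_series=3)
for tick in X.T:
    model.update(tick)

# The streamed result matches a batch computation over the same data.
batch = 0.5 * (X[:, :-1] @ X[:, 1:].T - X[:, 1:] @ X[:, :-1].T)
```

Each update costs on the order of N squared operations for N series, independent of how many ticks have already been seen.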

In addition, we are looking at the 2021 daily statistics behind the dominating leader-follower pairs of intraday stock/crypto prices. For example, here is a frequency count matrix, which is constructed by counting the total number of 2021 days in which crypto j’s intraday price time-series strongly preceded crypto k’s intraday price time-series.

2021 Crypto Intraday Leader Follower Frequency Matrix
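Such a frequency count matrix could be built along the following lines. What counts as "strongly preceded" is not spelled out here, so this sketch assumes a fixed threshold on the daily oriented area; the function name, the threshold value, and the tiny hand-made daily matrices are hypothetical.

```python
import numpy as np

def leader_follower_counts(daily_matrices, threshold):
    """Count the days on which entry [j, k] of the daily matrix exceeds threshold.

    `daily_matrices` has shape (D, N, N): one lead-lag matrix per day.
    """
    M = np.asarray(daily_matrices, dtype=float)
    return (M > threshold).sum(axis=0)

# Three hypothetical days for two assets.
daily = np.array([
    [[0.0,  0.5], [-0.5, 0.0]],   # asset 0 strongly precedes asset 1
    [[0.0,  0.1], [-0.1, 0.0]],   # weak relationship, below the threshold
    [[0.0, -0.7], [ 0.7, 0.0]],   # asset 1 strongly precedes asset 0
])
counts = leader_follower_counts(daily, threshold=0.3)
```

A column with large counts then points to an asset that followed many others on many days.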

The data is suggesting Monero (XMR) has been consistently following many of the other cryptocurrencies throughout the year. Here are some example days of Monero (XMR) following Solana (SOL):

Sample 2021 Days in Which Monero (XMR) strongly followed Solana (SOL)

See our current progress in this Jupyter Notebook.

References

[1] Robert Ghrist’s Animated Lecture on Time-Series and Stokes Theorem: https://www.youtube.com/watch?v=YXSI0etVYU8

[2] Cyclicity Analysis Github Repository: https://github.com/vskaush2/CyclicityAnalysis

[3] Stock Market Analysis Github Repository: https://github.com/vskaush2/StockMarketAnalysis
