Exploring Cycles in data.
A short overview
We are full of cycles.
Cycles are part of life, nature, and perhaps some of the data you might encounter. By cycles we mean events that repeat themselves in time or space with a certain periodicity.
If you live on planet Earth, you experience the day/night cycle every day: it gets dark and colder for roughly one third of the day, then light and warm for the rest of it. This series of events repeats itself over a period we call a day.
Seasons are another type of cycle: it is cold for a number of days, then it's warmer, and this repeats itself over a longer period of time. Life and death are yet another example of a cycle, but here the time scale is so large that we usually forget or don't notice we are part of a greater cycle.
Why study cycles?
By studying cycles, we can detect them and adapt our behavior to exploit or avoid a certain phase of the cycle. For example, if we know that temperatures will be cold and food scarce 6 months from now, we can prepare accordingly.
As mentioned earlier, cycles are everywhere. We biological beings are hardwired to some of them (days and seasons) and create our own (sleep/wake, fertility cycles, work/play, etc.), yet the utility of knowing how to identify and describe them extends to other domains…
Consider the problem of cycles in the financial markets and when to invest in them. Here the cycles are influenced by known and unknown factors, yet if you want to be a successful investor you need to be aware of where you are in the cycle, or as one prominent investor puts it:
"Being too far ahead of your time is indistinguishable from being wrong."
- Howard Marks (Oaktree Capital)
A cycle in detail.
For starters, let’s look at the simplest of cycles:
And some relevant data points:
| X | 0 | 4 | 8 | 12 | 16 | 20 | 24 | 28 | 32 |
| Y | -4 | 0 | 4 | 0 | -4 | 0 | 4 | 0 | -4 |
Note that the Y values (-4, 0, 4, 0, -4) repeat themselves over the non-repeating X axis (0…32). What we have here are 2 cycles, the first starting at (0,-4), each with a length of 16 on the X axis.
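To make that reading of the table a bit more concrete, here is a minimal Python sketch that looks for the smallest shift at which the Y values line up with themselves. The function name `smallest_period` is my own, and the brute-force comparison only works for clean data where the values repeat exactly, so treat it as an illustration rather than a general detector.

```python
def smallest_period(values):
    """Return the smallest shift p where values[i] == values[i + p]
    for every valid i, or None if no such shift exists."""
    n = len(values)
    # Only shifts up to half the data are considered, so that at least
    # two full repetitions are visible.
    for p in range(1, n // 2 + 1):
        if all(values[i] == values[i + p] for i in range(n - p)):
            return p
    return None


# Data points copied from the table above.
xs = [0, 4, 8, 12, 16, 20, 24, 28, 32]
ys = [-4, 0, 4, 0, -4, 0, 4, 0, -4]

steps = smallest_period(ys)        # cycle length in samples
length = steps * (xs[1] - xs[0])   # cycle length in X units
print(steps, length)               # prints: 4 16
```

The cycle spans 4 samples, and since the samples are 4 units apart on X, its length is 16.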
Here's another cycle that is found all over nature, science and engineering, and is usually referred to as a sine wave:
Sine waves deserve their own separate discussion, for now just note that they provide us with additional ways to talk about cycles and describe their parts.
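As a small illustration of the vocabulary a sine wave gives us, the sketch below builds one from an amplitude, a period and a phase. The specific values are my own assumption, picked so that the samples land on the same points as the table shown earlier; they are not taken from the figure.

```python
import math


def sine_wave(x, amplitude, period, phase=0.0):
    """Value of a sine wave at position x."""
    return amplitude * math.sin(2 * math.pi * x / period + phase)


# Assumed parameters, chosen to line up with the earlier table.
amplitude = 4            # height of a peak above the midline
period = 16              # length of one full cycle on the X axis
frequency = 1 / period   # cycles per unit of X
phase = -math.pi / 2     # shifts the wave so it starts at a trough

xs = list(range(0, 33, 4))
ys = [round(sine_wave(x, amplitude, period, phase), 6) for x in xs]

for x, y in zip(xs, ys):
    print(x, y)
```

With these three numbers (amplitude, period or frequency, and phase) you can describe the whole wave instead of listing every point.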
But more often than not you will encounter cycles in the raw like these ones:
The axes are left out on purpose so you can hopefully note that there are 2 large full cycles and an incomplete 3rd one. You can identify the first 2 by their peaks and troughs; the 3rd one is longer and hasn't peaked yet...
After noticing these features, we can reveal the mystery data as the Dow Jones Industrial Average (DJIA) from May 1997 to May 2019 (~22 years). These cycles represent the financial ups and downs of millions of people on planet Earth during those years.
Visually detecting cycles on a chart of your data is a perfectly valid way to figure out this cycle business. Unfortunately it lacks some refinement: if someone asked us for specific metrics about our cycles, we would be left gesturing at a chart... this cycle is about, hmmm, 2 thumbs wide!
Fortunately, smart people have been tackling cycles in a structured and mathematical fashion, so we can take advantage of that.
I'll explore a common and popular algorithm for cycle detection (Floyd's Tortoise and Hare), but there are a few more if you want to explore them at your own pace. Here's a good place to start:
Floyd’s Tortoise and Hare
We start with a number sequence (here the cycle is obvious) and place both the tortoise and the hare on the same starting point.
Like in the fable, the Hare is fast and the Tortoise slow: the Hare moves in increments of two (2) spaces and the Tortoise just one (1) at a time.
At this rate, if there is a cycle, both the Tortoise and the Hare will meet on the same value (0), thus revealing the cycle 0,4,8,4,0. Simple and elegant, but…
Notes: (1) This is a very naive explanation of the algorithm (for the sake of clarity). In reality we need to deal with nodes and pointers and implement the algorithm in your language of choice. A good starting point is Python, where you will need to learn and implement linked lists; after that you can add complexity. Here are a few implementations: Rosetta Code: Cycle detection.
Notes: (2) It might not be obvious, but once you have a cycle you can get cycle metrics: the min/max (trough/peak, here 0 and 8) and from them the amplitude. Things like frequency and period are also possible once you incorporate pointers (the X axis, which we are omitting in this example, but assume the data is continuous like a time series).
Notes: (3) This problem/algorithm has multiple practical applications: it is a favorite of coding interviews, and it also helps detect infinite loops and cryptographic collisions, amongst other uses.
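For the curious, here is a minimal sketch of the pointer-based version in Python. It follows the commonly described form of Floyd's algorithm, which works on a "next" function rather than on raw values. The node layout (`values`, `next_index`) is a made-up stand-in for a linked list, with a short tail leading into the 0, 4, 8, 4 cycle, so the names and numbers outside the cycle are assumptions for illustration only.

```python
def floyd(f, x0):
    """Floyd's tortoise and hare on the sequence x0, f(x0), f(f(x0)), ...
    Returns (mu, lam): the position where the cycle starts and its length.
    Assumes the sequence does eventually cycle."""
    # Phase 1: the hare moves two steps for every step of the tortoise
    # until they land on the same node somewhere inside the cycle.
    tortoise = f(x0)
    hare = f(f(x0))
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    # Phase 2: restart the tortoise; moving both one step at a time,
    # they now meet at the first node of the cycle.
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1

    # Phase 3: walk the hare around the cycle once to measure it.
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return mu, lam


# Hypothetical linked list: node i holds values[i] and points to next_index[i].
values = [7, 3, 0, 4, 8, 4]
next_index = [1, 2, 3, 4, 5, 2]   # node 5 links back to node 2

mu, lam = floyd(lambda i: next_index[i], 0)

# Collect the values on the cycle by walking it once.
node = 0
for _ in range(mu):
    node = next_index[node]
cycle = []
for _ in range(lam):
    cycle.append(values[node])
    node = next_index[node]

trough, peak = min(cycle), max(cycle)
print(mu, lam, cycle, trough, peak)   # prints: 2 4 [0, 4, 8, 4] 0 8
```

Comparing nodes instead of values matters here: the value 4 appears twice inside the cycle, so matching on values alone could report a meeting that isn't one.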
The world of cycles is vast, and depending on your specific needs and project it might be convenient to create your own research path or analysis. That's not to say there aren't more advanced ways to look at cycles in data, with corresponding techniques and tools. Here are a few rabbit holes you might want to consider…
Fourier Analysis: If it's a natural phenomenon (and what isn't), chances are it has a frequency (see the illustration on sine waves). Fourier analysis involves breaking a signal down into those frequencies and finding the functions that recreate them (once more, a gross simplification), the idea being that by reversing the process you can generate a new time series with your own variables.
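To give a taste of what "extracting frequencies" looks like, here is a small sketch using a plain discrete Fourier transform written with Python's standard library. The test signal (a sine wave of amplitude 4 and period 16, sampled 64 times) is an assumption made for the example, and in practice you would likely reach for a fast Fourier transform from a numerical library instead of this slow, direct version.

```python
import cmath
import math


def dft(signal):
    """Direct discrete Fourier transform (O(n^2), fine for short signals)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


# Assumed test signal: amplitude 4, period 16 samples, 64 samples in total.
n = 64
signal = [4 * math.sin(2 * math.pi * t / 16) for t in range(n)]

spectrum = dft(signal)

# Look only at the positive frequencies, skipping k = 0 (the average level).
k_peak = max(range(1, n // 2), key=lambda k: abs(spectrum[k]))

dominant_period = n / k_peak                       # in samples
estimated_amplitude = 2 * abs(spectrum[k_peak]) / n

print(k_peak, dominant_period, round(estimated_amplitude, 6))
```

The strongest component sits at 4 cycles per 64 samples, which recovers the period of 16 and the amplitude of 4 that went into the signal.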
Forecasting is a heavy subject (check the notes below). Once you have figured out that your data has a cyclical component and are done quantifying it, you also need to figure out if and when it will repeat itself. Will it follow a trend: up, down, sideways? What is driving the cycle, and what makes you so sure it will repeat itself forever?
These questions require not only some knowledge about cycles and the math & algorithms to recreate them in the future, but more importantly also knowledge of the subject you are forecasting to gain insights about future behavior and the underlying reasons that drive it.
For instance, a cycle in biology will sooner or later come to an abrupt stop when the organism dies, and a cyclical seasonal trend (think holiday sales) can be disrupted by new technology. As we will see, an external factor can also affect a cycle. Context here is king. Let's say you encounter the following unlabeled data/chart…
Without context we can make the reasonable observation that there are cycles and we can forecast the next one quite comfortably. Here's what actually happened, along with the missing context…
With the context restored, the actual data, and the previous observations, we can now see that the current cycle is not behaving in a normal or expected way. We can then look for possible causes and gain clarity.
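The "comfortable" forecast made without context can be sketched as a seasonal naive forecast, a common baseline that simply repeats the last observed cycle. This is not necessarily how the chart above was read; the function name and the toy history (the Y values from the earlier table, minus the closing point) are assumptions for illustration.

```python
def seasonal_naive(history, period, horizon):
    """Forecast the next `horizon` values by repeating the last full cycle."""
    if len(history) < period:
        raise ValueError("need at least one full cycle of history")
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


# Two full cycles of the earlier toy data, with a period of 4 samples.
history = [-4, 0, 4, 0, -4, 0, 4, 0]

forecast = seasonal_naive(history, period=4, horizon=6)
print(forecast)   # prints: [-4, 0, 4, 0, -4, 0]
```

A baseline like this assumes the cycle carries on unchanged, which is exactly the assumption that context can break.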
Notes: A few sources related to forecasting (especially time series) and cycles:
- Forecasting: Principles and Practice: an excellent introduction.
- Prophet (Facebook's forecasting library): check the white paper for a related discussion on forecasting. It is also a pretty cutting-edge tool.
- Time Series in Python- Part 2. Dealing with seasonal data: An excellent series on forecasting and extracting cyclical elements.
As an introduction and overview of cycles I hope this serves as a starting point; if you have any comments or want to suggest additions or subtractions, let me know.
About the Author:
Born Eugenio Noyola Leon (Keno), I am a Designer, Software Developer & Artist currently living in Mexico City. You can find me at www.k3no.com