How Similar is the Crypto-Bubble to the Dot-Com Bubble?

An analytical comparison between two waves of irrational exuberance.

In the era of bitcoin and blockchain, the late bandwagoners of the dot-com bubble believe they’ve been gifted a second chance at striking it rich. As a result, cryptocurrencies have become subject to the same rampant “irrational exuberance” (to borrow Alan Greenspan’s phrase) that plagued the dot-com era. In the past year, cryptocurrencies have given new life to this brand of speculative behavior, led by the notorious alpha currency: Bitcoin.

As a result, society has split into two camps: those who are infatuated with Bitcoin and a decentralized future, and those who are tired of hearing about it. The topic has become polarizing. For every influencer who endorses the technology, there’s an equally reputable voice shouting bullshit. Every day, the list of people marketing themselves as “crypto-experts” and “full-time cryptocurrency traders” doubles. One particularly disconcerting trend is the influx of crypto-investment advice columns. The few good ones recommend that you invest responsibly, understand the businesses you’re buying into, and stay disciplined in your approach. The bad ones read like clickbait (e.g., “How to predict the next Ripple, Ethereum, and Bitcoin,” “Cryptocurrencies you need to invest in right now,” etc.).

While the industry is mobilizing, the people closest to the technology have had a very difficult time predicting its trajectory. Most of the predictions I’ve seen are rooted not in data or deep analysis, but in feeling and atmosphere. I’m not saying I have the right answers, but I do believe there’s a better way to predict the trajectory of Bitcoin and other cryptocurrencies. It may even help validate whether the technology has a future at all. One potential approach is to look to the dot-com bubble for precedent. Any analysis that starts here raises a few questions:

  • Are the two bubbles actually similar?
  • If so, can we venture to guess the end of the bitcoin bubble?
  • And can we then predict how long it’ll take for the dust to settle after it bursts?

These are the questions I’m hoping to answer using rigorous analytical techniques. To accommodate readers with different levels of commitment, I’ve included both a TL;DR version and a lengthier analysis.

Short Version:

All great analyses start with a hypothesis, so here’s mine: if the dot-com bubble and the crypto-bubble are similar, then we can predict the future of the crypto-bubble using the dot-com bubble. This article focuses on the first half of that statement: proving that the bubbles are similar.

I applied a popular pattern-recognition technique called Dynamic Time Warping (DTW) to verify the claim that the crypto-bubble bears a resemblance to the dot-com bubble of the late 90s/early 2000s. For data, I leveraged two time series that I believe represent the growth and trend of both markets over time: 1) an aggregate of 600+ internet companies’ stock prices throughout the dot-com bubble and 2) scraped bitcoin prices starting from 2013. After several data collection, cleansing, and scaling exercises, I used DTW to map data points strategically, thereby minimizing the distance between the series. Ultimately, I plotted this path against the best-case, perfectly aligned linear path. The results indicate that the relationship between the bubbles is significant! Three statistics assess the fit of the regression model: R², the overall F-test, and the standard error of regression. The R² value of the described path against the regression is 0.92, while the standard error of regression is around 16.8 (or about a 10% difference), so the data fits the ideal case closely. Finally, I ran an F-test and concluded that the regression’s fit is significant (F-statistic = 3,720.68; p = 0.0000). All that is to say, the parallels between the dot-com bubble and the crypto-bubble are validated.

Now that I’ve verified that the bubbles are similar, in a subsequent piece I will use an econometric technique called Granger Causality to see if dot-com bubble data can be used to predict the growth of the crypto-bubble, and potentially its end.

Long Version:

Before beginning, it is important to recognize that the sizes of the two bubbles are very different. At its peak during the dot-com bubble, the NASDAQ’s market capitalization was approximately $6.7 trillion, whereas the cryptocurrency market is worth around $800 billion at the moment, or about 1/8th the size. Thus, I’m not seeking to understand whether they’re comparable in magnitude, but rather how they compare in growth. To achieve this, I isolated the growth of internet companies specifically during this period and compared their trends to the rise of bitcoin.

First off, all my source code is on Github and can be found here. You can find explanations of the assumptions I made in the eponymous section near the bottom of this paper; however, this analysis is centered around my core assumption that a company’s share price is fairly comparable to the price of a cryptocurrency. Both quantities are bought and sold speculatively, prone to similar fluctuations, and similar in magnitude. Equity may one day even be distributed and transacted in cryptocurrencies, which further solidifies the analogy.

This analysis starts the way many others do: by gathering data. I compiled a list of the 600 largest public internet companies between 1980 and 2017; the list can be found on the linked Github page. I then mined daily share price data for all of those companies from Yahoo Finance using Pandas Datareader, pulling the daily high, low, opening, and closing prices for each company. For this analysis, I used the daily closing price, as it is what traders commonly use in their assessments. Afterwards, I truncated the data to span August 5th, 1997 to March 10th, 2000.
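For the curious, here’s a minimal sketch of what this collection step can look like with Pandas Datareader. The exact code lives on Github; the tickers.csv file and the symbol column here are placeholders, and Yahoo’s endpoint has changed over the years, so this mirrors the interface as of this writing:

```python
import pandas as pd
from pandas_datareader import data as pdr

# Placeholder ticker list; the real list of 600 companies is on Github.
tickers = pd.read_csv("tickers.csv")["symbol"]

frames = {}
for symbol in tickers:
    try:
        df = pdr.DataReader(symbol, "yahoo", "1997-08-05", "2000-03-10")
        frames[symbol] = df["Close"]  # daily closing price, as used in the analysis
    except Exception:
        pass  # some companies weren't listed for the full window

closes = pd.DataFrame(frames)  # one column of closing prices per ticker
```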

For each day, I took the average share price across the 600 companies mentioned above (of course, not all 600 companies were around, or in business, for the whole duration). It’s important to note that I removed the dates the stock market doesn’t trade on (i.e., weekends and public holidays).
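Continuing from the `closes` DataFrame in the sketch above, the daily cross-company average is essentially a one-liner; companies not yet (or no longer) trading on a given day simply contribute missing values that the mean skips:

```python
# Average closing price across all companies for each trading day.
# NaNs (companies not yet, or no longer, listed) are skipped by mean().
dotcom_avg = closes.mean(axis=1, skipna=True).dropna()

# Yahoo only returns rows for trading days, so weekends and public
# holidays are already absent from the index.
```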

The crypto half of this data-gathering stage was more straightforward. Since bitcoin owns the lion’s share of the market, I believe its value is a strong enough indication of the whole market (for further justification of this decision, check out the Assumptions section). I used the Kraken exchange’s API to compile bitcoin’s daily historical prices from September 9th, 2013 to September 9th, 2017.
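A sketch of that pull, using Kraken’s public OHLC endpoint (interval=1440 minutes gives one candle per day). Note that a single call only returns a limited number of recent candles, so covering the full 2013–2017 window may require paging on the `since` parameter:

```python
from datetime import datetime

import pandas as pd
import requests

resp = requests.get(
    "https://api.kraken.com/0/public/OHLC",
    params={
        "pair": "XXBTZUSD",   # Kraken's BTC/USD pair code
        "interval": 1440,     # one candle per day
        "since": int(datetime(2013, 9, 9).timestamp()),
    },
)
rows = resp.json()["result"]["XXBTZUSD"]

# Column order follows Kraken's documented OHLC response format.
btc = pd.DataFrame(rows, columns=["time", "open", "high", "low", "close",
                                  "vwap", "volume", "count"])
btc["date"] = pd.to_datetime(btc["time"], unit="s")
btc_close = btc.set_index("date")["close"].astype(float)
```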

Here’s where I paused. There are several inherent challenges across both data sets, and I performed a few different data manipulations to address them:

  • Lots of Noise: Stock price data contains a lot of noise and is highly prone to fluctuation, and bitcoin prices are no different. To smooth the rapid fluctuations in stock prices, I took five-day (weekly) averages. I did the same for bitcoin prices, but with seven-day averages instead, since BTC can be traded 24/7.
  • Different Scales: The two signals are also not comparable in magnitude. Simply put, the dot-com bubble was a lot bigger than the bitcoin bubble (for now). Comparing market capitalizations might be better than share/coin prices, but it’s tricky to determine the market cap of the dot-com bubble over time. To address the scaling issue, I normalized each dataset so that all values are relative to each other and range from 0 to 1 (a short code sketch after this list illustrates both the smoothing and the normalization steps).
  • Uneven Timelines: The last issue is that the time series are of unequal length. Check out the Assumptions section for why I chose the start/end dates that I did. Typically, the solution for uneven time series is interpolation to fill the interstitial gaps, but our data sets are sampled uniformly at the same rate. My eventual choice of Dynamic Time Warping was influenced by the need to address this challenge; I’ll explain exactly why later in this paper.
  • Misc. Observations: Both data sets are non-seasonal. I’m not sure it’s worth dwelling on outliers, since we’re investigating speculative bubbles and unreasonable outcomes are precisely what we’re trying to understand.
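Here’s the promised sketch of the smoothing and normalization steps, picking up the `dotcom_avg` and `btc_close` series from the earlier sketches:

```python
def minmax(series):
    """Rescale a series so its values span [0, 1]."""
    return (series - series.min()) / (series.max() - series.min())

# Five-day rolling mean for stocks (one trading week), seven-day for
# bitcoin (it trades 24/7), then min-max normalization of each series.
stocks_smooth = minmax(dotcom_avg.rolling(window=5).mean().dropna())
btc_smooth = minmax(btc_close.rolling(window=7).mean().dropna())
```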

After this cleansing exercise, I plotted the normalized, smoothed prices against time below. At first glance, the results are promising:

Now that the data is about where we need it to be, let’s move on to the fun part: comparing the data series. Dynamic Time Warping (DTW) is a popular technique for analyzing time series, especially for those that vary in cadence and length.

In essence, DTW strategically maps points between two time series to find the minimum aggregate Euclidean distance between them. It does so by building a matrix of the distances, or “costs,” between all pairs of points across the two series. We then traverse the matrix, starting from the cell where both series begin, and repeatedly select the step with the least cost. This is how DTW pairs time series values to minimize the total distance between points. The visual below demonstrates this exact procedure:
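To make that matrix traversal concrete, here is a textbook O(n·m) dynamic-programming version of DTW, a minimal sketch rather than the package used in this analysis (the FastDTW approximation, shown later):

```python
import numpy as np

def dtw(a, b):
    """Return the minimum aggregate cost of aligning series a and b."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # pointwise distance ("cost")
            # Extend the cheapest of the three admissible predecessor cells.
            cost[i, j] = d + min(cost[i - 1, j],      # step in a only
                                 cost[i, j - 1],      # step in b only
                                 cost[i - 1, j - 1])  # step in both
    return cost[n, m]
```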

I could spend more time explaining this approach, but this is a well-documented, accepted technique for pattern recognition. To really understand the technique, you can check out the following resources:

  • YouTube: The narration is a bit annoying, but the video contains a good high-level explanation and walks through an example.
  • DTWs in Finance: An interesting paper on a few ways DTW can be used effectively in finance (I use a similar approach in this analysis).

For the purposes of this analysis, I used the “FastDTW” Python package. A small digression: off the shelf, DTW has O(n²) time and memory complexity, which isn’t great. A few algorithmic tricks can bring the space and amortized time complexity down to O(n), at the cost of an approximate rather than exact alignment. The library I used leverages these tricks, which is why it’s called FastDTW.
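Usage is essentially a two-liner on the prepared series. For 1-D series the pointwise Euclidean distance reduces to an absolute difference, supplied here as an explicit callable:

```python
from fastdtw import fastdtw

# Returns the minimum aggregate distance and the warping path: a list of
# (stock_index, btc_index) pairs describing how the points were mapped.
distance, path = fastdtw(stocks_smooth.values, btc_smooth.values,
                         dist=lambda a, b: abs(a - b))
```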

Running DTW on the normalized, smoothed time series, I obtained the distance between the two series and the lowest-cost path.

Now that we have our path and Euclidean distance, what do we do with them? How can we use these results to draw a conclusion about the data? My first instinct was to use the computed distance. Since the data was normalized first, the goal is to contextualize the distance between the time series. If the distance is very small, I can comfortably say that the series are similar (there’s more statistically rigorous language I could use here, but I want to keep this relatively digestible). If not, I can abandon this quest. But how do we set a threshold for evaluating that quantity? Precedent? I didn’t find much in the way of similar analyses using this approach. Qualitatively? After digesting a lot of information on both bubbles, I couldn’t ascertain any historical or contextual indicators that would set an apt threshold.

This left the algorithm’s second output: the path. It took a while for me to land on an approach, but I think I’ve developed a sound one. To understand it, think about the edge cases in this analysis. If two time series were identical, the path would be a perfectly straight line with one-to-one mappings. If the series were very different, the path would look chaotic. With this in mind, I drew a straight line between the starting and ending coordinates of the path (slope = 1.60). This line represents the ideal outcome: perfect coordination between the time series. I then compared the DTW path to that trend line by computing the R² value, an F-test, and the standard error of regression. The plot below summarizes this comparison.
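For reproducibility, a close analogue of this comparison (a sketch, not the exact code on Github) is to regress the path’s y-coordinates on its x-coordinates with statsmodels and read the three statistics off the fitted model:

```python
import numpy as np
import statsmodels.api as sm

xs = np.array([i for i, _ in path])
ys = np.array([j for _, j in path])

model = sm.OLS(ys, sm.add_constant(xs)).fit()
print(model.rsquared)                # R² (reported as 0.92 in the text)
print(model.fvalue, model.f_pvalue)  # F-statistic and its p-value
print(np.sqrt(model.mse_resid))      # standard error of regression (~16.8)
```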

The results, seen above, suggest that the similarity between the two bubbles is significant! The R² value of the described path against the regression is 0.92, while the standard error of regression falls around 16.8 (or about a 10% difference), so the data fits the ideal case closely. Finally, I ran an F-test and concluded that the regression’s fit is significant (F-statistic = 3,720.68; p = 0.0000). All that is to say, the parallels between the dot-com bubble and the crypto-bubble are validated. I can now use this as a foundation for any subsequent comparative analyses between the two bubbles.

Coming Up: Predictions

Now that we’ve established that both bubbles have similar trends, can we use one to predict the other? Looking at the plot below, my gut says that we might be able to. In a coming article, I’ll explore the use of an econometric test, called Granger Causality to see if I can make sound predictions about the crypto-bubble using dot-com bubble data. Reminder, the plot below are normalized. Since the metrics are not equivalent, normalizing them allows us to make more equivalent .

Assumptions:

  • Initially, I wanted to determine the start and pre-crash dates of the dot-com bubble analytically. While the latter date can be determined with ease, defining the beginning of speculative investment in internet companies would involve a fairly extensive quantitative analysis. In reality, it makes just as much sense to look at why the speculative investment occurred in the first place. Luckily, this approach makes our lives much easier. While many cite the rapid growth of new computer/IT-oriented companies, venture capital, and computer and internet consumption, the paper Capital Gains Taxes and Stock Return Volatility: Evidence from the Taxpayer Relief Act of 1997 offers a more defined timeline. The researchers argue that the Taxpayer Relief Act of 1997, which reduced the capital gains tax rate, created higher asset-return risk and increased stock-return volatility, and they show that the tax reduction was swiftly followed by rampant, greed-driven speculation in internet companies. The paper’s argument is defensible and compelling, both qualitatively and quantitatively. Thus, for the purposes of this analysis, I assumed that the start date of the dot-com bubble coincides with the enactment of the Taxpayer Relief Act on August 5th, 1997. The end date was much simpler to ascertain: the NASDAQ Composite peaked at 5,132.52 on March 10th, 2000, and proceeded to fall by 78% over the following 2.5 years. For obvious reasons, I pegged the end of the dot-com bubble to this date.
  • As of January 6th, 2018, the market capitalization of bitcoin clocked in at approximately $254 billion, whereas the whole cryptocurrency market was valued at $793 billion. Thus, bitcoin represents nearly a third of the entire crypto market. While that’s pretty compelling, market-cap share alone isn’t enough to prove that Bitcoin’s performance is indicative of the entire market. There are also intangibles at play. I’d argue that the notoriety of bitcoin alone speaks to its influence over the market. Bitcoin is the base cryptocurrency used to 1) store value and 2) buy other cryptocurrencies, so its performance and value effectively gate the ability to interact with the rest of the market. This is similar to how Ether is the gatekeeper for the technologies and currencies built on top of the Ethereum platform and network.
  • The basis of this analysis is my core assumption that a company’s share price is fairly comparable to the price of a cryptocurrency. Both quantities are bought and sold speculatively, prone to similar fluctuations, and similar in magnitude. Equity may one day even be distributed and transacted in cryptocurrencies, which would further solidify the analogy. In the end, I’m not interested in volumes or quantities, but rather in the growth and trajectory of the two industries; I believe both quantities are indicative of the peaks and troughs of both markets.

Approaches I thought about using, but didn’t…

  • 2-Sample T-Test: This test is geared towards determining whether there is a statistical difference between the two series’ means. I’m not sure that’s valuable in our case, as a comparison of means would give no indication of similarity in shape between the series.
  • Spectral Analysis: The approach here is to apply a Fourier transform to time series data to find the spectrum of irregularly sampled data, which can be used to obtain a corresponding power spectral density. Applying the inverse Fourier transform to that power spectral density yields a complete data set with a uniform sampling rate. This approach usually only works for periodic data that is sampled at irregular intervals and unlikely to contain outliers. Since our analysis compares two outlier events and uses fundamentally uniform, non-periodic data, this probably wasn’t the right way to go. I’m also not sure this approach could be used to extend the end of one time series to match the length of the other.
  • ARIMA: I strongly considered using ARIMA models to analyze this data, so much so that I even took tangible steps to validate the approach. The first step was figuring out which ARIMA model to choose. I started with the Autoregressive (AR) model, a linear model that predicts the present value of a time series from the immediately prior value. To determine whether the AR model is appropriate, I plotted stock prices (x[t]) against their single-lag values (x[t-1]), and did the same for bitcoin prices. The plots can be found below.
    There’s a strong correlation between stock price values and the lag-1 series (R² = 0.92). Similarly, there was a strong correlation (R² = 0.94) between bitcoin price values and the lag-1 series (unplotted). This suggests that an AR model might be a good approach, since it means we can predict any present value from the value at the previous time step (a short sketch after this list reproduces the lag-1 check). But I paused here: if we fit one of the ARIMA models to both series, how would we compare them? By comparing the resulting model parameters? Is there an accepted statistical approach for this? I wasn’t so sure. In the end, this Cross Validated exchange shifted my attention away from ARIMA.

Things I would change with more time (in no particular order)…

  • Derive a more analytical approach for finding the start and end dates of both time series.
  • Dig deeper into the differences in scale.
  • Expand the analysis to include more cryptocurrencies than Bitcoin.

Methodology

As mentioned, I’ve posted all my source code to Github here. In addition to the hyperlinks embedded throughout, I want to call out the following data sources, tools, and resources that were used to support this analysis:

Data sources:

Tools and Resources:
