The Volatility of Volatility for Stocks and Crypto

A gentle introduction to the Heston model.

NTTP
Operations Research Bit
13 min read · May 5, 2024

An easy-to-generate distribution that’s not so darn normal: simple stochvol.

With Excel examples!

This is the Excel file. Get it!

https://github.com/diffent/excel/blob/main/volofvol2.xlsx

When you first hear the phrase “the volatility of volatility,” and if you are into mathematics and forecasting models, you can easily think: Man, that’s a cool idea. But as cool as it is, that phrase does not capture the full behavior and flexibility of these types of asset pricing models. In essence, the model is even more interesting (and intricate) than the phrase suggests. What we will do in this Part 1 article is demonstrate the first piece, the so-called “v of v,” before getting into other parts of the Heston model.

For now, we won’t even print the Heston model equations here because they may throw you off, and you may think that this is all more difficult than it really is. You can look them up if you’d like, but they can be a bit intimidating at first.

To review, historical volatility of an asset is just the standard deviation of observed returns. We will work with daily returns here, and we will just leave the volatility in these daily units. Note that some published sources print annualized volatility, not the daily figure we use here.

For this exercise, we will not look at empirical returns data (right from the market), since we don’t want to confuse the reader unnecessarily. This will come in later articles (the empiricality, and also, possibly, the confusion). We will start with a theoretical asset that has zero drift (neither bullish nor bearish) and a standard deviation of 1, to make the formulas easier to grasp on first pass. Let’s say that this 1 means 1% daily volatility (stdev). We can even name this asset! We could call it: A) the common stock of the fictional and über generisch GlobeCo firm, B) a fancy new cryptocurrency with trading symbol NORMY, or C) any other fanciful asset name you might think of.

The formulae

Download and open the sheet above if you want, and follow along!

We started in our Excel example by generating a standard normal distribution of random data via the formula in column A:

=NORMINV(RAND(),0,1)

[the above is not what is in the example sheet currently, but is for reference here to remind you: “how to generate a standard normal scattering of points in Excel.”]

To review, “standard” just implies mean = 0, standard deviation = 1.

The standard deviation of 1 is the final argument in this function.

Now you see in our example sheet column A, we have this (in A3 for example) as:

=NORMINV(RAND(),0,1*D3) [Equation 1]

We make 1000 points (rows) of this in column A for a nice round number count.

D3 holds the value by which we scale the nominal standard deviation of 1 (the third argument of NORMINV), and D3 is itself computed by a formula.

The formula in D3 is just a bounds limiter for the real formula in C3, because Excel’s NORMINV function complains if we set the stdev value to zero or less: a practical edge-case guardrail for our formulas.

Now let’s look at C3 (this is the volatility of volatility generator that we will apply to our main random generator):

=NORMINV(RAND(),1,$I$3)

Once again we have a normally distributed random generator, but this time centered at 1 (the neutral multiplication factor), and we, the users, control the standard deviation of this distribution by changing I3.

So to review [Equation 1], the constant, non-random standard deviation argument of 1 in our original formula in the cells of column A is modified (scaled) by a random variable (D3). That is the basic “volatility of volatility” idea right there.
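For readers who prefer code to cell formulas, here is a minimal Python sketch of the same three-column idea. It is not a transcription of the sheet: the function name, the `floor` value, and the use of `max` as the bounds limiter are assumptions made for illustration, since the exact D3 formula is not printed here.

```python
import random

def stochvol_returns(n, stochvol, floor=1e-6, seed=None):
    """Return n simulated daily returns (in percent) with a random volatility multiplier."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n):
        # Column C analogue: multiplier centered at 1 with stdev = stochvol (cell I3).
        raw_multiplier = rng.gauss(1.0, stochvol)
        # Column D analogue: keep the stdev strictly positive (floor is an assumed value).
        multiplier = max(raw_multiplier, floor)
        # Column A analogue: zero-mean return whose stdev is 1 * multiplier.
        returns.append(rng.gauss(0.0, 1.0 * multiplier))
    return returns

returns = stochvol_returns(1000, stochvol=1.0, seed=42)
print(len(returns), min(returns), max(returns))
```

Passing `stochvol=0.0` makes the multiplier exactly 1 on every row, which corresponds to the plain standard normal generator.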

It’s interesting: When we first coded this up, instead of multiplying the original “returns” stdev by the new random variable, we added it (centering the new random generator at 0 instead of 1 so we get some adds and some subtracts from the main standard deviation, bumping it up and down as we go). That also seems to give an interesting model, but it is not the way the Heston model does it. Since we are trying to drive this story in the direction of Heston, we re-set the formula to the multiplier method. But: ideas…

We compute the first four moment-based metrics of the resulting data from column A (mean, standard deviation, skewness, kurtosis) and put them in column E for reference.

Figure 1: Upper left cell is E3 in the example sheet. This is just re-computation for checking our random returns generator in column A with I3 stochvol set close to zero (i.e., no stochvol).

Additionally, we plot a dynamic histogram of the 1000 data points in column A that gets recomputed automatically by Excel’s data analysis system:

Figure 2: Since we are only generating 1000 points, this does not show a symmetric standard normal distribution exactly.

“Winding up” returns into prices

Pretending that column A is a daily asset return in percent (with zero drift over time… mean 0 in our formula), we can compute the “price” series generated by this in column B by the ordinary method: Merely multiplying each day’s 1-centered return by the prior day’s price, starting the prices at an arbitrary value of 1 dollar in B2. See the formulas in column B for the basic details.
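The winding-up step can be sketched in a few lines of Python. This assumes the returns are in percent, so the "1-centered return" is taken to be 1 + r/100; the function name and the three sample returns are made up for illustration.

```python
def wind_up_prices(returns_pct, start_price=1.0):
    """Compound percent returns into a price series, starting at start_price."""
    prices = [start_price]
    for r in returns_pct:
        # Each day's price is the prior price times the 1-centered return.
        prices.append(prices[-1] * (1.0 + r / 100.0))
    return prices

example_returns = [1.0, -2.0, 0.5]  # hypothetical daily returns, in percent
prices = wind_up_prices(example_returns)
print(prices)
```

The output list is one element longer than the input, because it includes the starting price.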

Figure 3: Typical price series generated with stdnorm returns, no drift, no stochvol. Plot of column B. X is “trading days.” Note: top left title should say “Price time series”

We also plot the returns themselves as a time series in the sheet for reference, but we do not need to show that here. You’ll see it if you open the sheet!

Baseline model, zero stochastic volatility

Let’s look at some examples: First, trying a reference case, we set I3 close to 0. If we use exactly 0 for the stdev argument of NORMINV, NORMINV doesn’t work… so just keep I3 above zero.

Aggregate results are in column E [Figure 1 above].

(Your values may vary slightly, since this is a non-converged random generator sheet).

Mean is close to 0, stdev is close to 1, skew (bullish/bearish) is close to 0, and so is kurt(osis) (tail fatness). These values would be closer to their theoreticals if you added more datapoints (rows) to the model output. You can try this yourself later. (Of course you need to adjust the formulas in column E to make sure to pull those additional points if you do this).
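As a cross-check outside Excel, the four metrics can be computed with a short Python sketch. It uses plain population moments, so its numbers will differ slightly from Excel's sample-adjusted STDEV, SKEW and KURT on small samples; like KURT, it reports excess kurtosis, so a normal distribution scores near 0. The function name is hypothetical.

```python
import math
import random

def moment_metrics(data):
    """Return (mean, stdev, skewness, excess kurtosis) using population moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # variance
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    stdev = math.sqrt(m2)
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return mean, stdev, skew, excess_kurt

rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
mean, stdev, skew, kurt = moment_metrics(sample)
print(f"mean={mean:.3f} stdev={stdev:.3f} skew={skew:.3f} kurt={kurt:.3f}")
```

With 100,000 points instead of 1000, the four values sit much closer to their theoretical targets of 0, 1, 0 and 0.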

We plot the histogram of the column A “fake company” daily returns so that you can see how they are distributed over the 1000 fake trading days [Figure 2].

Every re-generate of the sheet (Shift F9 on a PC in Excel… or just change any cell that doesn’t have anything important in it to regenerate the sheet) generates new random values, so you will see the distribution stats float a bit around their theoretical targets, since we only are using 1000 points in this example.

Since our reference case is setting the volatility of volatility (I3) to close to zero, we would expect the resulting price time series(es) to behave like numeric approximations to ordinary Brownian motion (normal distribution driven motion), which it seems to do at first glance. Not that we can easily tell Brownian motion from any other kind of randomized motion by eye, but it seems like a reasonable plot (Figure 3).

Cranking up the stochvol

To make things easier to discuss, instead of volatility of volatility, let’s just call I3 “stochvol” for short.

We next crank up this stochvol in I3 to be 1 (to the same level as the ordinary volatility).

Now we see from our histogram (of column A data) that the artificially generated returns here (from our vol-of-vol generator equation in column A) start to seem more like what we observe in actual stock or crypto returns: the return distribution has fat tails!

Our metrics show this as well: Kurtosis (tail fatness) is large, and stdev is now significantly above 1. I guess we would expect skewness to be close to its normal distribution value of 0 still since we are not putting our thumb on the scale of the random generator, either bullish or bearish. We are merely widening (or narrowing) the base random generator in column A by another uncoupled random value.
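A short worked derivation backs up these three observations. It assumes an idealized version of the sheet: the return is r = σZ, where Z is standard normal and the multiplier σ is drawn independently of Z. The symbols r, σ, Z and s (the stochvol setting in I3) are introduced here for illustration.

```latex
% Assumed setup: r = \sigma Z, with Z ~ N(0,1) and \sigma independent of Z
\begin{aligned}
\mathbb{E}[r] &= \mathbb{E}[\sigma]\,\mathbb{E}[Z] = 0, \qquad
\mathbb{E}[r^3] = \mathbb{E}[\sigma^3]\,\mathbb{E}[Z^3] = 0 \\
\operatorname{Var}(r) &= \mathbb{E}[\sigma^2]\,\mathbb{E}[Z^2] = \mathbb{E}[\sigma^2] \\
\operatorname{Kurt}(r) &= \frac{\mathbb{E}[r^4]}{\operatorname{Var}(r)^2}
  = \frac{3\,\mathbb{E}[\sigma^4]}{\left(\mathbb{E}[\sigma^2]\right)^2} \;\ge\; 3
\end{aligned}
```

So skewness stays at 0, and kurtosis exceeds the normal value of 3 whenever σ actually varies. If σ is normal with mean 1 and standard deviation s, and we ignore the bounds limiter, then E[σ²] = 1 + s² and E[σ⁴] = 1 + 6s² + 3s⁴. For s = 1 that gives a stdev of about 1.41 and a kurtosis of 7.5 (excess 4.5). The bounds limiter in column D shifts these numbers somewhat, so treat them as a rough guide.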

The uncoupled nature of this 2nd random generator is important to note, because, in the more advanced Heston model, the 2nd generator is coupled to… let’s just say …“other things,” for now.

And the stochvol is not just widening the returns distribution (which could be accomplished merely by increasing the standard deviation of the generator), but also changing its shape.

Here are some snapshots of particular regenerates of the artificial price trace with this higher level of stochvol:

They look suspiciously like real stock market price series data, don’t they? In some time windows, they are bullish (consistently going up), even though there is absolutely no skew or mean “tilt” (positive or negative) in our random generators. Sometimes you just get lucky. In some time windows, they are bearish (consistently going down). In some time windows, they wander around and act like “oscillators” (“return to mean”), to use the vernacular. They have a kind of “rough” look to them. But as we note in a prior article, don’t try to do “technical analysis” on these charts and figure out patterns in these… because they are completely random. If your technical analysis results on these proceed to give good forecasts, it will be by random chance.

If you “read the charts” on a regular basis, do these price charts seem more realistic than the Brownian motion type chart in Figure 3, or not? Some jumps and leaps, some big drops on given days. Was this due to “news” or other new information that investors received? Not in our case here. There is no news involved, and there is no connection to the real world. These are strictly randomly generated paths. Yet, they look so… real!

The reveal

Now that you see the simplicity of this first part of a stochvol model, we will unveil the first part of the Heston stochvol model equations. Use your iPad as a shield so you don’t get too scared, and just peek around it. Have iPad Safari up and have it set to view nytimes.com. That way, it will block most rational thought from coming through. Kidding! I’m kidding! We love the New York Times!

First part of the Heston model
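For reference, the first equation of the Heston model is commonly written as below. This is the standard textbook form (as in [1]): S_t is the asset price, μ the drift, v_t the stochastic variance, and W_t^S the Brownian motion driving the price.

```latex
dS_t = \mu S_t \, dt + \sqrt{v_t} \, S_t \, dW_t^{S}
```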

See, this is why we didn’t paste this equation here first, or you might just go “click, bommm…” (sound of old landline phone hanging up) and scroll past this article. And this is just the first part of it, there’s more. And we already simplified it by leaving out the left term entirely (the mean “drift”) and just set that to zero in our model. And yet… those plots we printed above look pretty darn good as stock market models, don’t they?

Caution

One thing to note is that this is a stochastic differential equation, which describes how a probability distribution moves through time, and not how a single price point moves through time (as in the single time series(es) that we are generating here in our Excel file). Excel is generating these traces one at a time (every time you refresh the sheet), but the formal analytic equation here describes how a whole aggregation of these generated time series would move through time. This aggregation we approximate in our MCarloRisk3D apps (described in our earlier articles) to model how a price/probability distribution of a real asset [stock or crypto] moves through time. We will eventually try to do the same thing with this stochvol model in our Excel file (in future followups to this article), and, as a matter of fact, the first prototype of our MCarloRisk3D app was written in Excel! [No stochvol math in that sheet.] But first (in future articles) we will add other important terms to the stochvol model and let you see how they affect individual price traces.
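To make the "aggregation" idea concrete, here is a rough Python sketch that generates many independent traces and summarizes where they end up. It is only an illustration of the concept, not how MCarloRisk3D or the Excel sheet does it. The path count, the horizon of 250 days, the `floor` value and the function name are all assumptions.

```python
import random
import statistics

def simulate_terminal_prices(n_paths, n_days, stochvol, floor=1e-6, seed=0):
    """Simulate n_paths price traces and return each trace's final price."""
    rng = random.Random(seed)
    terminal = []
    for _ in range(n_paths):
        price = 1.0  # every trace starts at 1 dollar
        for _ in range(n_days):
            # Random volatility multiplier, kept positive by an assumed floor.
            multiplier = max(rng.gauss(1.0, stochvol), floor)
            daily_return_pct = rng.gauss(0.0, multiplier)
            price *= 1.0 + daily_return_pct / 100.0
        terminal.append(price)
    return terminal

terminal = simulate_terminal_prices(n_paths=500, n_days=250, stochvol=1.0)
quartiles = statistics.quantiles(terminal, n=4)
print("median terminal price:", statistics.median(terminal))
print("quartiles:", quartiles)
```

Each refresh of the Excel sheet corresponds to one pass of the outer loop; the collection of final prices is a crude sample of the distribution that the equation describes at that horizon.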

The radicand

Eagle eyed readers will notice the square root symbol over v sub t in the above complicated looking formula (our stochastic volatility). I don’t think it’s really v, I think it is Greek nu, which options traders like to call vega, but a different vega than they are used to, so… anyway. Just v sub t here. We didn’t add this square root into our Excel formulas yet, since it adds nothing to the understanding of the volatility of volatility concept in this first part of our description… but it is easy to add (which we will do in the next article in this series). And it will likely make our model better, if we believe Prof. Heston & Co.

Of course you can try it yourself. Just wrap SQRT around the formula in column C. If you do this, you have to take care that the values you are “square rooting” are non-negative, or Excel gives bad results. (These values are the “radicands,” to use an excellent word that I was reminded of recently in this story by another mathematics-focused Medium writer. Did I ever even know it, or am I just imagining that I did?) We don’t want to inject imaginary numbers into our price model, which is what taking the square root of a negative number would produce. In practice Excel does not return an imaginary number; it gives #NUM! results, FYI. You can avoid this by using smaller values in I3 (the stochvol setting) or by other means.

You can also look up various Heston references to learn why that square root is there. The usual textbook reading is that v sub t is the instantaneous variance of returns, so its square root is the volatility (a standard deviation). Whether it was also chosen because it fit observation better than scaling without the root, this author doesn’t recall offhand, but we will post it here when we find out the answer, if we find out the answer.

In our example model here, untethered from any real data, all the SQRT would do is change the shape of the stochvol distribution by which we scale the original returns distribution. Since the standard normal density curve has (Y) values always below 1 (its peak is about 0.4), the square root function magnifies them in a nonlinear manner.
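One of the "other means" of keeping the radicand non-negative is to clamp it before taking the root. The Python sketch below shows that idea; the function name and the `floor` value are assumptions, and this is not the formula used in the sheet.

```python
import math
import random

def sqrt_multiplier(raw_multiplier, floor=1e-6):
    """Square root of the volatility multiplier, clamped so the radicand stays positive."""
    # Clamping first avoids the math domain error that a negative radicand would raise
    # (the Python counterpart of Excel's #NUM! result).
    return math.sqrt(max(raw_multiplier, floor))

rng = random.Random(7)
raw = [rng.gauss(1.0, 0.5) for _ in range(5)]
print([round(sqrt_multiplier(x), 4) for x in raw])
```

Clamping changes the distribution a little, since every negative draw lands on the same tiny value, so a smaller I3 remains the gentler fix.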

Applying the square root function magnifies values below 1 (x < 1) and attenuates values above 1 (x > 1). Smaller values get magnified more, in relative terms, than values closer to 1.
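A few sample values make the magnify/attenuate behavior easy to see. The inputs in this small Python sketch are chosen arbitrarily for illustration.

```python
import math

# Ratio sqrt(x) / x: above 1 means magnified, below 1 means attenuated.
ratios = {x: math.sqrt(x) / x for x in (0.01, 0.25, 0.81, 1.0, 4.0)}

for x, ratio in ratios.items():
    print(f"x={x:<5} sqrt={math.sqrt(x):.3f} ratio={ratio:.3f}")
```

The ratio equals 1/sqrt(x), so it grows without bound as x approaches 0 and drops below 1 once x exceeds 1.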

As an example, applying the square root to a standard normal distribution shape causes it to be widened and have fatter tails:

How the standard normal distribution would get modded when we take the square root of it. It gets wider with fatter tails.

In our MCarloRisk3D app in the stochastic volatility section of the Tune panel, we have a feature to allow any power to be applied to the base vol-of-vol (e.g.: why 0.5 square root? why not 0.4? 0.6?). Once you build a model, there are enticing tuning parameters that you can think of to let you adjust the model closer to observations that you make from reality.

The Tune panel in our MCarloRisk3D app with stochvol features opened up (right panel). This also gives a hint at other Heston model parameters that our app supports, and which we will get into in the next article of this series.

So, anyway, that’s what the square root seems to be doing there… it is making the effect of the stochvol multiplier stronger at the extremes than if we were to use a thinner-tailed normal distribution for this stochvol modifier.

Hints of the next phase

This is not the end of our story, for as I hinted at the beginning of this article, “the volatility of volatility” is only part of what is going on in the Heston model. No, what stochvol models like this also have in them are links between this vol-of-vol and other events that occur in reality, such as directional change of price. This is to account for observations such as: assets tend to be more volatile when they are going down in price than when they are going up in price. There is also a “stickiness” factor involved… an asset which gets more volatile tends to stay volatile for a while and doesn’t immediately revert to long term more constant behavior. Wow, right? A lot of interesting stuff to think about besides theoretical Brownian motion… where all asset returns are normally distributed, “and all the children are above average,” as they are claimed to be in the small Minnesota town of Lake Wobegon.

References

[1] We snap-shotted the first equation of the Heston model from: Heston Stochastic Volatility Model with Euler Discretisation in C++, https://www.quantstart.com/articles/Heston-Stochastic-Volatility-Model-with-Euler-Discretisation-in-C/

Additional reading

[2] Find part 2 of this series here (published May 10 2024): https://medium.com/@nttp/the-volatility-of-volatility-part-2-97c01392353c

[3] If you want to read ahead on the application of this model type in our MCarloRisk3D app using real data, check out our white paper for a case study example applying this model to Bitcoin price modeling: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3693387

[4] MCarloRisk3D software can be found on the macOS and iPad app stores, along with the store.microsoft.com Windows app store. However, the Windows version is behind the Apple versions because MS pulled the rug out from under Project Islandwood which we used to port the code over to Windows: https://blogs.windows.com/windowsdeveloper/tag/project-islandwood/

“Wow, what a great project!” [Let’s get rid of it.] Yet for some reason they have the resources to put a cartoon in my search bar. What is that? Why is that?

Cartoons, muscling onto the turf of the Windows search bar.
