Introduction to Volatility, Part 2

Thales Academy
8 min read · May 4, 2022


Standard Deviation and Normal Distribution

In Part 1 of this series on Volatility we covered the basics of options and why it’s worth understanding how they are priced. We touched on the way financial models must consider randomness when calculating a price and how this could lead to gaps that can be exploited by traders. Now we can dive into the process of measuring just how intense this randomness is for a particular asset (volatility) and how that impacts the price of an option.

In this article we’ll cover standard deviation, a concept from statistics that is essentially interchangeable with volatility in finance. This will get a bit nerdy, but since Thales was a mathematician himself I think he’d be on board with this approach.

What is Standard Deviation and how is it related to Volatility?

Standard deviation is a way to measure how much a set of price movements is spread out (or deviates) from its average. As we saw in Part 1, this is the main factor when pricing an option: the more dispersed a set of prices is, the more likely it is that a large price movement will take place at any given time. Once you have a measure of how volatile an asset is, you can combine it with the other factors that apply to an option (time to maturity, strike price) to arrive at a logical price.

I hope you’re taking notes

To find the standard deviation of a data set, you must first find another metric known as “variance”. Variance captures how far a single price tends to be from the entire data set’s average, but it does so in a way that is really only helpful when comparing multiple data sets against each other. More on this later, but know that the variance of a data set is required before you can find the standard deviation.

So variance is a measure of the average distance to the average. Let’s break that down with a small example data set. I’ll start with 8 numbers: 10, 12, 15, 18, 11, 12, 14 and 20.

To find our mean (the average) for this data set, we add 10+12+15+18+11+12+14+20 = 112, then divide by 8 to get a mean of 14.

Next, we take each number minus our mean, which gives us the following deviations:

10 - 14 = -4, 12 - 14 = -2, 15 - 14 = 1, 18 - 14 = 4, 11 - 14 = -3, 12 - 14 = -2, 14 - 14 = 0 and 20 - 14 = 6

Once you have the numbers above, you square each one (multiply it by itself). This may seem odd, because we end up taking a square root just a few steps later, which largely undoes the squaring. But the squaring serves a purpose: it removes any negative signs, since the product of two positives or two negatives is always a positive number. This way the distance of each value from the mean is counted as a positive number, instead of positive and negative deviations cancelling each other out. We can’t continue the process with the plain sum of our deviations (for any data set, it adds up to zero); instead we want to treat each one as distance travelled, because travelling one negative unit of distance is still moving one unit. This is how we capture the total distance of each data point from the average.
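
As a quick check (a minimal Python sketch of my own, not part of the original walkthrough), you can see that the raw deviations really do cancel out, which is why the squaring step is needed:

```python
data = [10, 12, 15, 18, 11, 12, 14, 20]
mean = sum(data) / len(data)                  # 14.0
deviations = [x - mean for x in data]         # [-4.0, -2.0, 1.0, 4.0, -3.0, -2.0, 0.0, 6.0]

print(sum(deviations))                        # 0.0 -- positives and negatives cancel out
print(sum(d ** 2 for d in deviations))        # 86.0 -- squaring keeps every term positive
```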

So once we’ve squared our results we have: 16, 4, 1, 16, 9, 4, 0 and 36.

You then find the mean of these squared results: 16+4+1+16+9+4+0+36 = 86.

86 / 8 = 10.75, which is our variance.

This is the point where we take the square root of our result, removing the impact of the squaring earlier in the process. Once we’ve taken the square root of this new mean we are left with the standard deviation.

So the square root of 10.75 = 3.27871926215

You can check my math here: https://www.calculatorsoup.com/calculators/statistics/variance-calculator.php
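
If you’d rather verify it in code, here is a minimal Python sketch of the same steps using only the standard library (the variable names are my own):

```python
from math import sqrt
from statistics import pstdev

data = [10, 12, 15, 18, 11, 12, 14, 20]

mean = sum(data) / len(data)                      # 112 / 8 = 14.0
squared_deviations = [(x - mean) ** 2 for x in data]
variance = sum(squared_deviations) / len(data)    # 86 / 8 = 10.75
std_dev = sqrt(variance)                          # ~3.278719

print(variance, std_dev)
print(pstdev(data))    # the standard library's "population" standard deviation, same answer
```

Note that this divides by the full count of 8 (a “population” standard deviation); many libraries default to dividing by n - 1 instead (a “sample” standard deviation), which would give a slightly larger number.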

Variance vs. Standard Deviation

Great, so the variance for our set of 8 numbers is 10.75 and the standard deviation is 3.278719, but what does that even mean? What do 10.75 and 3.278719 tell us about our set of data?

First, let’s break down variance a bit more. Variance is a measure of our data set’s dispersion (how spread out it is), represented as a conceptual measure of the average distance each number will be from the mean. I say “conceptual” measure because variance is actually the average of the squared distance from the mean, and since we haven’t taken the square root yet, this number is not on the same scale as our data set. Variance is more of a mathematical stepping stone than a useful measurement on its own. You can’t really look at your numbers 10, 12, 15, 18, 11, 12, 14 and 20 and gain much from the 10.75 variance. But even though variance by itself is not all that useful, it does have some value: you can compare the variance of different data sets to understand the difference in variation between them. If you had another 8 numbers and ran the same process, a higher variance would represent a higher degree of dispersion than the original data set. And of course you can use the variance to get the standard deviation of your data set, which is a number that is much easier to interpret.
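
For example, here is a quick sketch of that comparison (the second data set is just something I made up, chosen to have the same mean of 14 but a wider spread):

```python
from statistics import pvariance

calm = [10, 12, 15, 18, 11, 12, 14, 20]   # the article's example set
wild = [2, 25, 7, 30, 1, 28, 5, 14]       # a made-up set with the same mean but wider spread

print(pvariance(calm))   # 10.75
print(pvariance(wild))   # 127.0 -- much larger, reflecting the wider dispersion
```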

Since standard deviation is the square root of variance, you end up with a number that is back on the same scale as our original data set. The standard deviation defines a “normal” range for a data point to land in relative to the average: a data point that falls within 1 standard deviation of the average is generally considered normal, and anything further out starts to look like an outlier. Our example data set 10, 12, 15, 18, 11, 12, 14 and 20 (with a mean of 14) has a standard deviation of 3.278719, so any number that is 3.278719 or less away from 14 can be considered normal. If we look at one of our data points, let’s say 15, we know it is fairly normal (close to our average of 14) since it is within 3.278719 of the mean. If we look at 20, we can say it is outside the normal range, since it is more than 1 standard deviation (3.278719) away from the average (14).
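
To make that concrete, here is a small sketch of my own (using Python’s statistics module, which matches our hand calculation) that labels each point as inside or outside one standard deviation of the mean:

```python
from statistics import mean, pstdev

data = [10, 12, 15, 18, 11, 12, 14, 20]
mu = mean(data)      # 14.0
sd = pstdev(data)    # ~3.278719

for x in data:
    label = "normal" if abs(x - mu) <= sd else "outside 1 standard deviation"
    print(f"{x}: {abs(x - mu):.2f} from the mean -> {label}")
```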

A fun example I’ve found looks at a group of dogs and does an analysis of their heights using standard deviation. The dogs that fall within the standard deviation can be considered normal height, while the dogs that end up outside this number can be considered tall or short compared to the overall data set. You can imagine this with the recorded prices of an asset: a move within the standard deviation is quite normal, while a bigger one is less expected and therefore hints at higher volatility.

Now all we need to gain some insights into our data set is some kind of universal framework to help us organize our results, and for that we use normal distribution.

Normal Distribution: Visualizing Standard Deviation with the Bell Curve

So what is normal distribution? Normal distribution describes how outcomes spread around an average: the average sits in the middle, the outcomes are arranged symmetrically around it, and values close to the mean are more common than values landing farther away. It is a useful model for organizing random data sets ranging from genetics to games of chance to the natural world. To visualize this distribution we use what is known as the Bell Curve. Yes, that Bell Curve.

The bell curve is a visualization of the normal distribution. For a normal distribution you can expect roughly 68% of the values in your data set to fall within 1 standard deviation of the mean, and as you move further from the average you capture more and more of the possible outcomes: 2 standard deviations contain about 95% of results, 3 contain about 99.7%, and so on. We can see this in action by looking at our example from earlier (a standard deviation of 3.278719). If you were to pick a random value from the data set, you would expect it to land within 3.278719 of the average of 14 about 68% of the time. Our data set is tiny, so the match is only approximate: of 10, 12, 15, 18, 11, 12, 14 and 20, five of the eight numbers fall within 3.278719 of our mean of 14 (62.5%), and you can imagine a larger data set pushing that percentage closer to 68%.
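
You can check both numbers with a short sketch (the 100,000-point sample is an arbitrary choice of mine): it measures how much of our tiny data set falls within one standard deviation, then does the same for a large sample drawn from a true normal distribution using Python’s random.gauss:

```python
import random
from statistics import mean, pstdev

def share_within(data, k=1):
    """Fraction of data points within k standard deviations of the mean."""
    mu, sd = mean(data), pstdev(data)
    return sum(abs(x - mu) <= k * sd for x in data) / len(data)

small = [10, 12, 15, 18, 11, 12, 14, 20]
print(share_within(small))       # 0.625 -- 5 of the 8 points

big = [random.gauss(14, 3.28) for _ in range(100_000)]
print(share_within(big))         # ~0.68
print(share_within(big, 2))      # ~0.95
print(share_within(big, 3))      # ~0.997
```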

Summary

That was a lot to digest. Just remember that volatility is the main factor in deciding how much an option should cost, and it is simply the standard deviation of that asset’s price movements. Standard deviation, in turn, is the distance most normal data points will fall from the average. The more spread out an asset’s prices are, the larger that normal range, and the higher the volatility.

In the next installment I’ll break down more calculations needed to arrive at a price for a particular option. There’s even more nerdy math to get through, but this kind of knowledge wouldn’t be fun if it was easy to obtain. Hopefully this information helps traders feel more confident when analyzing options and eventually build up an intuition of what something should cost and how to find good opportunities. At the very least, just by making it this far, you’ve made Thales proud.

Read Part 1 of the series here.

[This article is meant to be used as an educational resource and is not intended to serve as financial advice.]
