Simple Use of Standard Deviation
Most of us know standard deviation as a metric that tells us how spread out a data set is around its mean. Sounds simple? It was not for me the first time I read that statement. Let's build some intuition for standard deviation with a simple example.
Example 1
I like cricket, which is why I am using it here; the same reasoning works for any game you prefer. Suppose you have two cricket batsmen whose batting scores are listed in the table below. Keep in mind that all of these scores come from the same stage of one tournament, e.g. the World Cup semi-final, from 1999 to 2015.
All the above scores are arbitrary
I like both batsmen, and both have been great across their playoff seasons. They belong to the same team. If you look at the mean (average score), you will find that both have the same mean of 85, so this statistic alone does not let us compare how they perform in a specific situation. Judging by the mean, we would say both are at the same level in every situation. Standard deviation tells us how far each batsman's scores typically stray from that average, which helps us decide who is the safer choice in a given situation.
Based on the figure above, we can say that Misbah is a more consistent player than Afridi. The spread (variability) of the data indicates how consistent or inconsistent it is, and standard deviation is a measure of that spread: it tells you how far each batsman's scores typically lie from their mean of 85.
We use the following formula to find the standard deviation of data
where Σ (sigma) means “sum of”, x is a value in the data set, x-bar (x̄) is the mean of the data set, and n is the number of data points.
Here, x in our example is each batsman's score.
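Written out with the symbols just defined, the formula would look roughly like the expression below. This is a reconstruction, not a copy of the original: it assumes the sample form with an n − 1 divisor and five scores per batsman, which is the reading consistent with the division by 4 in the worked numbers. The population form would divide by n instead.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```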
Afridi
SD = √(250/4)
Afridi SD ≈ 7.9
Misbah
SD = √(90/4)
Misbah SD ≈ 4.74
As you can see, Misbah is more consistent around his average score than Afridi: his standard deviation is 4.74, which means his scores stay closer to his average of 85. Afridi is less consistent: his standard deviation is 7.9, so his scores stray further from the same average of 85.
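The calculation above can be sketched in a few lines of Python. The score lists here are hypothetical: they were chosen only so that each mean is 85 and the squared deviations sum to 250 and 90, matching the figures used in the text. The n − 1 divisor is likewise an assumption, inferred from the division by 4.

```python
import math
import statistics

# Hypothetical scores: chosen so each mean is 85 and the squared deviations
# sum to 250 and 90, matching the figures used in the text.
afridi_scores = [75, 80, 85, 90, 95]
misbah_scores = [79, 82, 85, 88, 91]


def sample_sd(scores):
    """Return the standard deviation using the n - 1 divisor."""
    mean = sum(scores) / len(scores)
    squared_deviations = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(squared_deviations / (len(scores) - 1))


afridi_sd = sample_sd(afridi_scores)
misbah_sd = sample_sd(misbah_scores)

print(f"Afridi: mean={statistics.mean(afridi_scores)}, SD={afridi_sd:.2f}")
print(f"Misbah: mean={statistics.mean(misbah_scores)}, SD={misbah_sd:.2f}")
```

The standard library's `statistics.stdev` uses the same n − 1 divisor, so it should return the same values as the hand-written `sample_sd`.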
Let's suppose that in one playoff game (a World Cup semi-final) their team requires 79 runs, and the team captain has the option to send either Afridi or Misbah in to bat.
If he sends Afridi in to bat, Afridi's typical range around his average score is 77.1 to 92.9 (85 ± 7.9). If he sends Misbah instead, Misbah's typical range around his average score is 80.26 to 89.74 (85 ± 4.74).
In Afridi's case, the lower bound of 77.1 falls short of the required 79 runs, which makes him a bit risky for the chase. In Misbah's case, the whole range from 80.26 to 89.74 sits above the required 79 runs. Based on these statistics and this situation, Misbah is the better candidate. In short, standard deviation gives a single (univariate) variable some predictive power when comparing data, and it helps us form a hypothesis from the data we have.
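The captain's comparison can be sketched as a small check of each batsman's lower bound against the target. This treats mean ± 1 standard deviation as a rough typical range, which is the heuristic used in this example and not a guarantee of any score. The names `one_sd_band`, `MEAN_SCORE` and `TARGET` are illustrative.

```python
import math

MEAN_SCORE = 85
TARGET = 79  # runs required in the scenario above

# Standard deviations recomputed from the sums of squared deviations in the text.
standard_deviations = {
    "Afridi": math.sqrt(250 / 4),
    "Misbah": math.sqrt(90 / 4),
}


def one_sd_band(mean, sd):
    """Return the (lower, upper) band one standard deviation around the mean."""
    return mean - sd, mean + sd


for name, sd in standard_deviations.items():
    lower, upper = one_sd_band(MEAN_SCORE, sd)
    verdict = "stays above" if lower >= TARGET else "dips below"
    print(f"{name}: {lower:.2f} to {upper:.2f}, lower bound {verdict} {TARGET}")
```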
Here I have described the predictive power of standard deviation for a specific situation. I am not saying that standard deviation lets you compare which batsman has more variation in his batting scores overall, only which one has more variation in a specific season or setting.
A very basic practical use of standard deviation appears in regression analysis. The residual standard deviation is the standard deviation of the residuals, that is, the differences between the observed values and the values predicted by the regression line. It is also known as the standard deviation of points around a fitted line. The closer the residual standard deviation is to 0, the better the regression line fits the observed data.
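As a minimal sketch of that idea, the snippet below fits a straight line by ordinary least squares and measures the spread of the points around it. The data are hypothetical, the function names are illustrative, and the n − 2 divisor is the usual convention for a line with two estimated parameters, not something stated in the text above.

```python
import math

# Hypothetical observations for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(xs, ys))
    sxx = sum((a - x_mean) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def residual_sd(xs, ys):
    """Standard deviation of the points around the fitted line.

    Uses n - 2 degrees of freedom, the usual convention for a line
    with two estimated parameters.
    """
    intercept, slope = fit_line(xs, ys)
    residuals = [b - (intercept + slope * a) for a, b in zip(xs, ys)]
    return math.sqrt(sum(r ** 2 for r in residuals) / (len(xs) - 2))


print(f"Residual standard deviation: {residual_sd(x, y):.3f}")
```

Points that fall exactly on a line give a residual standard deviation of 0, and the value grows as the points scatter further from the fitted line.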
Resources
https://www.investopedia.com/terms/r/residual-standard-deviation.asp