1. The Normal Distribution
A statistical distribution is a way to represent the possible values and
their respective probabilities for a random variable. It describes the likelihood
of observing different outcomes or values of a variable in a population
or a sample. We say that data like stock returns are normally distributed
when more values are found closer to the mean, fewer values are found
further away from the mean, and half the values are below the mean and
half the values are above the mean. This gives us the well-known
bell-shaped curve that we can see here.
The standard deviation is used to measure how spread out the data is.
The more spread out the data, the wider the bell-shaped curve and
the larger the standard deviation. A really useful property of the normal
distribution is that about 68% of the values in the sample or population are within
one standard deviation of the mean. This means about 34% lie between the mean and
one standard deviation below it, and about 34% lie between the mean and one
standard deviation above it. Going
a bit further, about 95% of the values lie within two standard deviations of
the mean and about 99.7% of the values lie within three standard deviations of
the mean. Normal distribution tables can be used to show this information,
although Excel has some really useful normal distribution functions to make
our lives a bit easier. So here we are back in our Statistics
for Finance template workbook and I've scrolled down to the normal distribution
section. Now I want to introduce a function in Excel called
the NORMSDIST function. Current versions of Excel write it as NORM.S.DIST,
which takes a second argument for the cumulative distribution.
What this function does is calculate the area under the curve
from the far left hand side of the normal distribution up to a certain
number of standard deviations from the mean, and remember,
the area under the curve is the same as the probability.
So if I start in the middle of my table here where
I've got zero standard deviations from the mean, zero standard deviations
from the mean is the mean, it's right in the middle of our
normal distribution. So if I use the NORMSDIST function,
take Z as my number of standard deviations from the
mean, which is zero, say TRUE for the cumulative distribution, and
press Enter once I close the bracket, I get 50%.
And that's because the area under the curve from the far left hand
corner of the normal distribution up to the mean is 50%.
Now remember, if I move one standard deviation above the mean, the area
between the mean and that point is about 34%.
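As a rough cross-check outside Excel, the same cumulative areas can be reproduced in Python. This is only a sketch: it assumes `statistics.NormalDist` from the Python standard library as a stand-in for NORMSDIST, and the helper names `area_left_of` and `area_within` are made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def area_left_of(z: float) -> float:
    """Area under the curve from the far left up to z standard deviations."""
    return STD_NORMAL.cdf(z)


def area_within(k: float) -> float:
    """Area between k standard deviations below and k above the mean."""
    return area_left_of(k) - area_left_of(-k)


# Area from the far left up to the mean (z = 0).
print(f"left of the mean: {area_left_of(0):.2%}")
# Area between the mean and one standard deviation above it.
print(f"mean to +1 sd:    {area_left_of(1) - area_left_of(0):.2%}")
# Areas within one, two and three standard deviations of the mean.
for k in (1, 2, 3):
    print(f"within {k} sd:      {area_within(k):.2%}")
```

The within-k figures come out at roughly 68.27%, 95.45% and 99.73%, which are the values usually rounded to 68%, 95% and 99.7%.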
And so if I go one standard deviation above the mean, the area
under the curve from the far left hand corner up to that point
will be the 50% up to the mean plus another 34.13%,
which means we have 84.13%. And if we go all the way to
four standard deviations above the mean, we get to practically 100% (about 99.997%),
because almost all of the values for a normal distribution lie within four standard deviations
of the mean. So if I just have a look at this
point here, two standard deviations above the mean, I have 97.72%. And we
can see that in the diagram and it's represented by the double asterisks.
So if I just do the double asterisks, you can see 97.72 is the
area under the curve from the far left hand corner up to that
point. The remaining area, the area under the curve to the right of
two standard deviations above the mean, must be 100% minus the 97.72%, which
is 2.28%. And so we can use the NORMSDIST function to find the
area under the curve. Now if I start from the far left hand corner,
four standard deviations below the mean, the area under the curve,
do you know what it would be? Have a guess while I type
out my NORMSDIST. Z is minus four standard deviations from the mean and
we want the cumulative set to TRUE. Now what do you think it is?
Did you say 0%? Now if you did, well done. Because the area
under the curve, if we start on the far left hand corner of
the curve, the far left hand corner is four standard deviations below the
mean so there's practically no area under the curve as yet (about 0.003%). But as we
get closer to the mean, the area under the curve gets bigger and
bigger. And so, if I have a look at this 15.87, one standard
deviation below the mean, I've represented that on my diagram with a single
asterisk. We can see there from the far left hand corner up to
one standard deviation below the mean, the area under the curve is 15.87%.
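The two tail areas marked on the diagram can be checked in the same way. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for NORMSDIST; the variable names are illustrative only.

```python
from statistics import NormalDist

std_normal = NormalDist()  # defaults to mean 0, standard deviation 1

# Area from the far left up to two standard deviations above the mean.
left_of_plus_2 = std_normal.cdf(2)
# The total area under the curve is 1, so the right tail is the complement.
right_of_plus_2 = 1 - left_of_plus_2

# Area from the far left up to one standard deviation below the mean.
left_of_minus_1 = std_normal.cdf(-1)
# The curve is symmetric, so this matches the area to the right of +1.
right_of_plus_1 = 1 - std_normal.cdf(1)

print(f"left of +2 sd:  {left_of_plus_2:.2%}")
print(f"right of +2 sd: {right_of_plus_2:.2%}")
print(f"left of -1 sd:  {left_of_minus_1:.2%}")
print(f"right of +1 sd: {right_of_plus_1:.2%}")
```

Because the function only ever measures from the far left, any area to the right of a point has to be found by subtracting from the total, and the symmetry of the curve explains why the left tail at minus one matches the right tail at plus one.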
So the NORMSDIST function is a great function to use in Excel to
find the area under the curve. But remember, it works in a certain
way. It finds the area under the curve from the far left hand
corner up to a certain number of standard deviations away from the mean.
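To see the whole pattern at once, the table in the workbook can be approximated with a short loop. This is a sketch under the assumption that the table runs from minus four to plus four standard deviations in steps of one; the workbook's actual layout is not shown here, and `statistics.NormalDist` is again standing in for NORMSDIST.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Assumed range: -4 to +4 standard deviations in whole steps.
z_values = range(-4, 5)

# Map each z to the area from the far left of the curve up to z.
cumulative_area = {z: std_normal.cdf(z) for z in z_values}

print(f"{'z':>3}  {'area to the left':>16}")
for z, area in cumulative_area.items():
    print(f"{z:>3}  {area:>16.2%}")
```

The areas at minus four and plus four are roughly 0.003% and 99.997%, so at two decimal places they display as 0.00% and 100.00%, even though the tails of the curve never quite reach zero.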
2. Let's practice!