1. Skewness, kurtosis and the Jarque-Bera test
There are many numerical tests of normality, but the one that you are going to use - the Jarque-Bera test - is based on measures of skewness and kurtosis.
2. Skewness and kurtosis
Skewness and kurtosis are two further moment-based summaries of a distribution's shape, complementing the mean and standard deviation.
The formulas are given on the slide. Recall that mu hat and sigma hat are the sample mean and sample standard deviation.
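The slide formulas themselves are not reproduced in this transcript; a standard form consistent with the narration (writing b1 for sample skewness and b2 for sample kurtosis, notation assumed) is:

```latex
b_1 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{\mu}\right)^3}{\hat{\sigma}^3},
\qquad
b_2 = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \hat{\mu}\right)^4}{\hat{\sigma}^4}
```

Here the third and fourth centered sample moments are standardized by the corresponding power of the sample standard deviation, which makes both quantities scale-free.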
The skewness of a distribution, as the name suggests, is a measure of its asymmetry. The skewness of a normal distribution is zero.
One way of interpreting the kurtosis of a distribution is as a measure of heavy-tailedness, that is, the tendency of the distribution to generate extreme values.
The kurtosis of a normal distribution is three. If the kurtosis of return data is greater than 3, the data are heavier-tailed than normal, generating more extreme values, and typically also have a more sharply peaked center. Such a distribution is called leptokurtic.
Recall the picture of the FTSE returns. When compared with the normal, the histogram had long tails and a very peaked center. This is a classic example of a leptokurtic distribution.
3. Skewness and kurtosis (II)
Let's go ahead and calculate the skewness and kurtosis of the FTSE returns using functions in the moments package.
As you can see, the value of the skewness is quite modest; the data aren't particularly asymmetric. However, the kurtosis is considerably more than 3, as we would expect from the picture.
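The course uses the skewness and kurtosis functions from R's moments package, and the FTSE returns themselves are not part of this transcript. A rough Python equivalent using scipy, applied to simulated heavy-tailed returns as a stand-in, sketches the same calculation:

```python
import numpy as np
from scipy.stats import skew, kurtosis

# Simulated stand-in for daily log-returns (the real FTSE data are
# not included here): Student t with 5 degrees of freedom, scaled
# to a daily-return-like magnitude.
rng = np.random.default_rng(42)
returns = 0.01 * rng.standard_t(df=5, size=5000)

# fisher=False gives the "raw" kurtosis, so a normal distribution
# scores 3, matching the convention used in the lesson.
s = skew(returns)
k = kurtosis(returns, fisher=False)

print(f"skewness: {s:.3f}")   # modest for a symmetric t distribution
print(f"kurtosis: {k:.3f}")   # well above 3: heavy tails
```

The `fisher=False` flag matters: scipy's default subtracts 3 (so a normal distribution scores 0), whereas the lesson's convention, like R's moments package, reports the raw value.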
4. The Jarque-Bera test
The Jarque-Bera test of normality is based on a test statistic that simultaneously compares the skewness and kurtosis of the data with their values for a normal distribution, that is, 0 and 3.
So the test can detect departures from normality caused by asymmetry, heavy tails, or a combination of both.
The test statistic T is given on the slides and is compared with a chi-squared distribution with 2 degrees of freedom.
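The slide is not reproduced in the transcript; the standard Jarque-Bera statistic, consistent with the narration (with b1 the sample skewness and b2 the sample kurtosis, notation assumed), is:

```latex
T = n\left(\frac{b_1^2}{6} + \frac{\left(b_2 - 3\right)^2}{24}\right)
\;\overset{\text{approx.}}{\sim}\; \chi^2_2
```

Each of the two terms contributes one degree of freedom: the first grows when the data are asymmetric, the second when the kurtosis departs from the normal value of 3.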
Let's apply it to the FTSE log returns.
In this case, the value of the test statistic is huge at 428, and the p-value, the estimated probability that such an extreme result could be observed if the data really were normal, is effectively zero.
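The lesson applies R's jarque.test from the moments package to the FTSE data, which are not included here. A hedged Python sketch using scipy's implementation shows the same qualitative behaviour on simulated data:

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(7)

# Heavy-tailed stand-in for daily log-returns.
heavy = 0.01 * rng.standard_t(df=5, size=3000)
# Genuinely normal data for comparison.
normal = rng.normal(0.0, 0.01, size=3000)

stat_h, p_h = jarque_bera(heavy)
stat_n, p_n = jarque_bera(normal)

# For the heavy-tailed sample the statistic is large and the
# p-value effectively zero, so normality is rejected; for the
# normal sample the statistic is typically small.
print(f"heavy-tailed: T = {stat_h:.1f}, p = {p_h:.2e}")
print(f"normal:       T = {stat_n:.1f}, p = {p_n:.2f}")
```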
In statistical language, the hypothesis of normality can be rejected.
5. Longer-interval and overlapping returns
That was an analysis of daily log-returns. In the first chapter you saw how weekly, monthly, quarterly or other longer-interval log-returns could be constructed from daily returns by simply adding them up.
But now recall the central limit theorem. As you add up iid variables the distribution of the sum gets closer and closer to a normal distribution.
Of course you don't necessarily know that returns are independent, but the main idea of the CLT, the convergence to normality, actually holds in many situations where data aren't independent.
So it might be expected that longer-interval returns are more normally distributed. In the exercises you will investigate whether this is true.
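One way to preview this effect, sketched in Python under the assumption that a "weekly" return is the sum of five consecutive daily log-returns, is to aggregate simulated heavy-tailed daily returns into non-overlapping blocks and compare kurtosis:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(1)

# Simulated heavy-tailed daily log-returns (stand-in for real data).
daily = 0.01 * rng.standard_t(df=5, size=5000)

# Weekly log-returns: sum non-overlapping blocks of 5 daily returns.
weekly = daily.reshape(-1, 5).sum(axis=1)

k_daily = kurtosis(daily, fisher=False)
k_weekly = kurtosis(weekly, fisher=False)

# The weekly kurtosis sits closer to the normal value of 3,
# as the CLT argument above suggests.
print(f"daily kurtosis:  {k_daily:.2f}")
print(f"weekly kurtosis: {k_weekly:.2f}")
```

Note that the weekly series has only a fifth as many observations, which is exactly the loss of data the narration turns to next.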
One thing to note is that when you aggregate over longer and longer intervals you have fewer and fewer observations to analyse, and the tests of normality lose power.
So another thing you are going to try is calculating moving sums of daily returns. This gives so-called overlapping returns. It preserves the quantity of data but does introduce strong serial dependencies which can complicate interpretation.
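A moving sum of daily returns can be sketched in Python with a cumulative-sum trick (the 5-day window and the function name are illustrative choices, not from the course):

```python
import numpy as np

def overlapping_returns(daily, window=5):
    """Moving sums of daily log-returns: one overlapping
    longer-interval return starting at every day."""
    c = np.concatenate(([0.0], np.cumsum(daily)))
    return c[window:] - c[:-window]

daily = np.array([0.01, -0.02, 0.005, 0.012, -0.007, 0.003])
print(overlapping_returns(daily, window=5))
# Six daily observations yield two overlapping 5-day returns.
# The two windows share four of their five days, which is the
# strong serial dependence mentioned above.
```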
6. Let's practice!
Now it's time for you to try all of these things out.