
Other common Multivariate distributions

1. Multivariate t-distributions

Though the multivariate normal distribution is widely used for modeling multivariate data, not all multivariate processes follow a normal distribution. Similar to a univariate situation, the data might be heavy-tailed or skewed, which would require other multivariate distributions. In this chapter, we will discuss multivariate t-distributions, which can model heavier tails and are a generalization of the univariate Student's t-distribution. Later, we will focus on skewed multivariate distributions.

2. Parameters for multivariate distributions

First, you should be aware of some terminology used to specify non-normal distributions. Recall that the mean of a normal distribution is its location parameter, and the variance-covariance matrix is its scale parameter. For functions specific to t-distributions in R, the location parameter is denoted by delta, and the scale parameter keeps the same sigma notation used for the normal. In the context of multivariate skew distributions, which we will discuss later, the location parameter is denoted by xi and the scale parameter is denoted by Omega.

3. Parameters for multivariate distributions

Besides the location and scale parameters, t- and skew-t distributions need an extra parameter for the degrees of freedom.

4. Comparing univariate normal with univariate t-distributions

This animation compares the shape of the univariate t density, shown in orange for degrees of freedom ranging from 1 to 30, with the standard normal density, shown in blue. The t-distribution with 1 degree of freedom has the fattest tails and the most suppressed peak. The tails become narrower as the degrees of freedom increase until, at approximately 30 degrees of freedom, there is very little visible difference between the normal and the t-distribution.
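The comparison in the animation can be approximated with base R alone. The following is a minimal sketch, not the code behind the animation; the grid of x values and the choice to show only 1 and 30 degrees of freedom are assumptions made for illustration.

```r
# Grid of points at which to evaluate the densities (range chosen for illustration)
x <- seq(-5, 5, length.out = 501)

# Standard normal density and t densities for two degrees of freedom
dens_norm <- dnorm(x)
dens_t1   <- dt(x, df = 1)
dens_t30  <- dt(x, df = 30)

# Peak height at zero: lowest for df = 1, close to the normal for df = 30
peaks <- c(t_df1 = dt(0, df = 1), t_df30 = dt(0, df = 30), normal = dnorm(0))
print(round(peaks, 3))

# Overlay the curves, using colors that echo the animation
plot(x, dens_norm, type = "l", col = "blue", ylab = "density")
lines(x, dens_t1, col = "orange")
lines(x, dens_t30, col = "orange", lty = 2)
```

The dashed orange curve (30 degrees of freedom) should be hard to tell apart from the blue normal curve, while the solid orange curve (1 degree of freedom) has a visibly lower peak and fatter tails.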

5. Comparing normal and t-distribution tails

The animation and table here zoom in on the tails and compare the probability of observing a value greater than 1.96 or less than -1.96. For the normal distribution this probability is 0.05, whereas for a t-distribution with one degree of freedom the same tail area has a probability of about 0.3. Even for 30 degrees of freedom, where the overall densities looked very similar, the tail probability is slightly higher than 0.05.
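These tail areas can be checked with the base R distribution functions pnorm() and pt(). This is a small sketch of the calculation; the variable names are chosen here for illustration and are not from the course.

```r
# Two-sided tail probability P(|X| > 1.96) under each distribution
cutoff <- 1.96
tail_normal <- 2 * pnorm(-cutoff)
tail_t1     <- 2 * pt(-cutoff, df = 1)
tail_t30    <- 2 * pt(-cutoff, df = 30)

tail_probs <- c(normal = tail_normal, t_df1 = tail_t1, t_df30 = tail_t30)
print(round(tail_probs, 3))
```

Because both distributions are symmetric around zero, doubling the lower tail gives the combined area below -1.96 and above 1.96.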

6. Multivariate t-distribution notation

The multivariate t-distribution is a generalization of the univariate t-distribution and is denoted by t_df(delta, Sigma), where delta is the location vector, Sigma is the scale matrix, and df is a single degrees-of-freedom value shared across all dimensions.
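For reference, the density behind this notation can be written out. The form below is the standard location-shifted multivariate t density; the symbols p (the number of dimensions) and nu (standing in for df) are introduced here for readability and are not part of the course notation.

```latex
f(\mathbf{x}) =
\frac{\Gamma\!\left(\frac{\nu + p}{2}\right)}
     {\Gamma\!\left(\frac{\nu}{2}\right) (\nu \pi)^{p/2} \, |\Sigma|^{1/2}}
\left[ 1 + \frac{1}{\nu} (\mathbf{x} - \delta)^{\top} \Sigma^{-1} (\mathbf{x} - \delta) \right]^{-\frac{\nu + p}{2}}
```

With p = 1, delta = 0 and Sigma = 1, this reduces to the univariate Student's t density with nu degrees of freedom.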

7. Contours of bivariate normal and t-distributions

Now we will compare the contours of a bivariate t-distribution with 3 degrees of freedom with those of a bivariate normal with the same location and variance-covariance parameters. Notice that for both distributions all the contours are elliptical. The subtle difference in color in the outer rings of the two contour plots is due to the heavier tails of the t-distribution.
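The heavier tails can also be seen numerically by evaluating both densities at the center and at a point far from it. The sketch below assumes the mvtnorm package is installed; the values of delta and sigma and the two evaluation points are hypothetical, chosen only for illustration.

```r
library(mvtnorm)

# Hypothetical parameters, chosen only for illustration
delta <- c(0, 0)
sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)

# One point at the center and one far out in the tail
center <- c(0, 0)
far    <- c(4, 4)

# dmvt() returns the log density by default, so request log = FALSE
t_center <- dmvt(center, delta = delta, sigma = sigma, df = 3, log = FALSE)
t_far    <- dmvt(far,    delta = delta, sigma = sigma, df = 3, log = FALSE)
n_center <- dmvnorm(center, mean = delta, sigma = sigma)
n_far    <- dmvnorm(far,    mean = delta, sigma = sigma)

# Density in the tail relative to the density at the center
ratios <- c(t = t_far / t_center, normal = n_far / n_center)
print(ratios)
```

The relative density in the tail should be much larger for the t-distribution than for the normal, which is what the outer contour rings reflect.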

8. Functions for multivariate t-distributions

Similar to the multivariate normal, the mvtnorm package has four functions for multivariate t-distributions: rmvt(), dmvt(), qmvt(), and pmvt(). Instead of the mean, we now need to specify delta, which the package documentation calls the non-centrality parameter and which plays the role of the location parameter here, along with the degrees of freedom, df. First, let's implement the rmvt() function.
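As a rough orientation, here is one call to each of the four functions. This is a sketch under stated assumptions: the mvtnorm package is installed, the bivariate delta and sigma are hypothetical, and delta is kept at zero so that the different default `type` settings of these functions do not matter.

```r
library(mvtnorm)

# Hypothetical bivariate parameters, for illustration only
delta <- c(0, 0)
sigma <- matrix(c(1, 0.5, 0.5, 1), nrow = 2)
df    <- 4

set.seed(1)

# r: draw random samples (one row per sample)
samples <- rmvt(n = 5, sigma = sigma, df = df, delta = delta)

# d: density at a point (log = FALSE because the default is the log density)
dens <- dmvt(c(0.5, -0.5), delta = delta, sigma = sigma, df = df, log = FALSE)

# p: probability that both coordinates are at most 1
prob <- pmvt(lower = rep(-Inf, 2), upper = c(1, 1),
             delta = delta, df = df, sigma = sigma)

# q: equicoordinate quantile q such that P(X1 <= q, X2 <= q) = 0.95
#    (delta is left at its default of zero)
quant <- qmvt(p = 0.95, tail = "lower.tail", df = df, sigma = sigma)$quantile

print(list(density = dens, probability = as.numeric(prob), quantile = quant))
```

Note that pmvt() and qmvt() rely on numerical integration, so their results are approximations and can vary slightly between runs.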

9. Generating random samples

If the task is to generate 2000 samples from a 3-dimensional t-distribution with 4 degrees of freedom, we can use rmvt() with n equals 2000, specified values of delta and sigma, and df equals 4.
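The call described above might look like the following. The course does not give the values of delta and sigma in this transcript, so the ones below are hypothetical placeholders; the seed and the object name are also chosen here for illustration.

```r
library(mvtnorm)

# Hypothetical location vector and scale matrix for a 3-dimensional example
delta <- c(1, 2, -5)
sigma <- matrix(c(1, 1, 0,
                  1, 2, 0,
                  0, 0, 5), nrow = 3)

# Fix the seed so the draw is reproducible
set.seed(42)

# 2000 samples from a 3-dimensional t-distribution with 4 degrees of freedom
multt_sample <- rmvt(n = 2000, sigma = sigma, df = 4, delta = delta)

# One row per sample, one column per dimension
print(dim(multt_sample))
print(head(multt_sample))
```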

10. Comparing with normal samples

Using the same location parameter and variance-covariance matrix, the plot for trivariate samples from a t-distribution with 4 degrees of freedom is on the left, and the plot for the normal distribution is on the right. The heavy tails of the t-distribution are clearly visible in all scatterplots, where a larger proportion of points lies farther from the center of the distribution than in the normal case.
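One way to put a number on what the scatterplots show is to count how many points lie far from the center. The sketch below uses squared Mahalanobis distances for this; that measure, the 1% cutoff, and the delta and sigma values are assumptions made for illustration and are not taken from the course.

```r
library(mvtnorm)

# Hypothetical parameters shared by both distributions
delta <- c(1, 2, -5)
sigma <- matrix(c(1, 1, 0,
                  1, 2, 0,
                  0, 0, 5), nrow = 3)

set.seed(42)
t_sample <- rmvt(n = 2000, sigma = sigma, df = 4, delta = delta)
n_sample <- rmvnorm(n = 2000, mean = delta, sigma = sigma)

# Squared Mahalanobis distance of each point from the center
d2_t <- mahalanobis(t_sample, center = delta, cov = sigma)
d2_n <- mahalanobis(n_sample, center = delta, cov = sigma)

# Distance that only about 1% of trivariate normal points should exceed
cutoff <- qchisq(0.99, df = 3)

far_t <- mean(d2_t > cutoff)
far_n <- mean(d2_n > cutoff)
print(c(t = far_t, normal = far_n))

# Pairwise scatterplots of each sample
pairs(t_sample, main = "t, df = 4")
pairs(n_sample, main = "normal")
```

The share of distant points should come out noticeably larger for the t sample than for the normal sample.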

11. Comparing with normal samples

As the degrees of freedom increase to 8, the scatterplot becomes slightly more concentrated.
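Part of the reason for this tightening is that the covariance of a multivariate t-distribution equals the scale matrix multiplied by df / (df - 2), which is defined only for df greater than 2. The short sketch below tabulates that factor; the helper name `inflation` is hypothetical and introduced here only for illustration.

```r
# Factor by which the covariance exceeds the scale matrix (valid for df > 2)
inflation <- function(df) df / (df - 2)

dfs <- c(4, 8, 30)
factors <- sapply(dfs, inflation)
names(factors) <- paste0("df=", dfs)
print(round(factors, 3))
```

The factor shrinks toward 1 as the degrees of freedom grow, so the spread of the samples moves closer to that of a normal distribution with the same sigma.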

12. Let's generate samples from a multivariate t-distribution!

Let's practice generating samples from a multivariate t-distribution.