1. Cumulative Distribution and Inverse CDF
In this lesson, we will discuss calculating the cumulative distribution function or CDF and the inverse CDF under the multivariate normal distribution.
2. When do we need to calculate CDF and inverse CDF?
Let's discuss a univariate problem first. A supermarket ground coffee jar is advertised to contain 200 grams of coffee.
3. When do we need to calculate CDF and inverse CDF?
The manufacturer uses a machine to pour the ground coffee into jars and x, the amount of coffee the machine pours, is normally distributed with mu equals 210 grams and sigma equals 10 grams.
4. When do we need to calculate CDF and inverse CDF?
To find the proportion of jars containing less than the advertised amount of coffee, we need to calculate the probability of x taking the value 200 or less, given by the blue-shaded region in the plot.
5. When do we need to calculate CDF and inverse CDF?
The pnorm() function can be used to calculate this area. The resulting probability of 0 point 159 implies that about 16 out of 100 jars contain less than 200 grams of coffee.
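As a minimal sketch of the calculation just described, the call below evaluates the normal CDF at 200 with the mean and standard deviation from the coffee example. The variable name p_under is only an illustrative choice.

```r
# Proportion of jars holding 200 grams or less,
# when the poured amount is normal with mean 210 and sd 10
p_under <- pnorm(200, mean = 210, sd = 10)

round(p_under, 3)  # 0.159
```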
6. When do we need to calculate CDF and inverse CDF?
Suppose the manufacturer wants to know the weight below which 95 percent of the jars fall. It is simply the value x naught where the CDF is 0 point 95, in other words the inverse CDF at 0 point 95. Calculating the inverse CDF using the qnorm() function with arguments p equals 0 point 95, mean equals 210, and sd equals 10 gives 226 point 45, so 95 percent of the coffee jars will contain less than 226 point 45 grams of coffee.
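The inverse CDF calculation can be sketched the same way. The name x0 stands in for x naught and is only illustrative. Feeding the result back into pnorm() is a quick check that the two functions undo each other.

```r
# Weight below which 95 percent of jars fall
x0 <- qnorm(p = 0.95, mean = 210, sd = 10)

round(x0, 2)  # 226.45

# Round trip: the CDF evaluated at x0 recovers the probability 0.95
pnorm(x0, mean = 210, sd = 10)
```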
7. Cumulative distribution for a bivariate normal
Let's generalize the concept of cumulative distribution to a bivariate normal. The cumulative distribution at x equals 2 and y equals 4 is the volume under the bivariate density over the region where x is less than 2 and y is less than 4. The illustration shows this volume as the specified slice of the overall bivariate density.
Direct calculation of the volume that gives us the probability requires multivariable calculus. Instead, we will use the pmvnorm() function to calculate the probability.
8. Cumulative distribution using pmvnorm
To calculate the CDF in the previous slide we use pmvnorm() by specifying arguments upper, mean, and sigma.
Notice that since the probability is calculated numerically, the output also reports the estimated error and a completion message.
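A sketch of such a call is shown below, using pmvnorm() from the mvtnorm package. The transcript does not state the mean vector or covariance matrix used on the slide, so mu_assumed and sigma_assumed are hypothetical values chosen only to make the example runnable.

```r
library(mvtnorm)

# ASSUMPTION: illustrative parameters, not taken from the lesson text
mu_assumed    <- c(1, 2)
sigma_assumed <- matrix(c(1, 0.5,
                          0.5, 2), nrow = 2)

# P(x < 2, y < 4); lower defaults to -Inf in every dimension
cdf_val <- pmvnorm(upper = c(2, 4), mean = mu_assumed, sigma = sigma_assumed)
cdf_val

# The numerical error estimate and completion message are stored as attributes
attr(cdf_val, "error")
attr(cdf_val, "msg")
```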
9. Probability between two values using pmvnorm
Suppose we want to calculate the probability of x lying between 1 and 2, and y lying between 2 and 4, within a rectangular area given by the red rectangle.
We can again use the pmvnorm() function, but now with the argument lower set to the vector 1, 2 and the argument upper set to the vector 2, 4, so that each vector holds the x bound followed by the y bound.
10. Probability between two values using pmvnorm
This gives us the probability 0 point 163 illustrated by the volume of the green mass on the right.
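The rectangle probability can be sketched as follows. As before, mu_assumed and sigma_assumed are hypothetical parameters, since the transcript does not state the ones behind the slide, so the printed value need not match the figure quoted above. The second part checks the result against the four corner values of the CDF, which is what the rectangle probability amounts to.

```r
library(mvtnorm)

# ASSUMPTION: illustrative parameters, not taken from the lesson text
mu_assumed    <- c(1, 2)
sigma_assumed <- matrix(c(1, 0.5,
                          0.5, 2), nrow = 2)

# P(1 < x < 2, 2 < y < 4): each vector is ordered (x bound, y bound)
rect_prob <- pmvnorm(lower = c(1, 2), upper = c(2, 4),
                     mean = mu_assumed, sigma = sigma_assumed)
rect_prob

# Helper (hypothetical name): bivariate CDF at the point (a, b)
cdf_at <- function(a, b) {
  as.numeric(pmvnorm(upper = c(a, b), mean = mu_assumed, sigma = sigma_assumed))
}

# Inclusion-exclusion over the rectangle's corners
rect_check <- cdf_at(2, 4) - cdf_at(1, 4) - cdf_at(2, 2) + cdf_at(1, 2)
rect_check
```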
11. Inverse CDF for bivariate normal
Let's generalize the concept of inverse CDF to bivariate normals. Suppose we are interested in finding the smallest ellipse that contains 95 percent of the total volume of the bivariate normal.
The animation on the left shows the increasing elliptical contours, starting from the center, and the right panel shows the corresponding proportion of the bivariate volume contained within the ellipse. The contour which contains 0 point 95 probability is the same as the 0 point 95 quantile.
We use the qmvnorm() function to calculate these contours.
12. Implementing qmvnorm to calculate quantiles
If we wish to calculate the 0 point 95 quantile of a standard bivariate normal, first we construct a 2 by 2 matrix with diagonal ones and off-diagonal zeroes using the diag() function.
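Then we use the qmvnorm() function, specifying arguments p, sigma, and tail. The result is the quantile element of the output, which also contains the error precision and completion message. As the off-diagonal entries of the variance-covariance matrix are zero, the elliptical contours become circles. Note, however, that when tail is set to both tails, qmvnorm() returns the value q for which x and y both lie between minus q and q with probability 0 point 95. The quantile therefore describes a square region centered at the mean, which is close to, but not the same as, the radius of the circular contour.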
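The steps above can be sketched as below, assuming the tail argument is set to "both.tails" (the transcript does not show the exact value used). Per the mvtnorm documentation, that setting finds q such that every component lies between minus q and q with probability p. The last lines compute the radius of the circle holding 0 point 95 probability from the chi-squared distribution, purely as a comparison.

```r
library(mvtnorm)

# Variance-covariance matrix of a standard bivariate normal
sigma_std <- diag(2)

# ASSUMPTION: tail = "both.tails"; solves P(-q < x < q, -q < y < q) = 0.95
q_out <- qmvnorm(p = 0.95, sigma = sigma_std, tail = "both.tails")
q <- q_out$quantile
q  # roughly 2.24

# Check: the probability of the square [-q, q] x [-q, q] is close to 0.95
box_prob <- pmvnorm(lower = c(-q, -q), upper = c(q, q), sigma = sigma_std)
box_prob

# Comparison: radius of the circle containing 0.95 probability, using the
# fact that x^2 + y^2 is chi-squared with 2 degrees of freedom
r_circle <- sqrt(qchisq(0.95, df = 2))
r_circle  # roughly 2.45
```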
13. Let's practice!
Let's practice using the pmvnorm() and qmvnorm() functions.