Density and cumulative density

1. Density and cumulative density

When you flip a fair coin ten times, what's the most likely number of heads? Well, since heads and tails are equally likely, you can probably figure out that the most likely outcome is that 5 come up heads, 5 tails. Say I offer you a bet: if it is exactly that result, I'll pay you a dollar, otherwise you'll pay me a dollar. Should you take the bet?

2. Simulating many outcomes

To answer this, we need the probability that a binomial variable X with these parameters (ten flips, each with a 50% chance of heads) results in an outcome of 5. We write this as Pr(X = 5). One way to find it is to simulate many draws from X, say a hundred thousand, and then see how common each outcome is. As you saw in the last exercises, you choose the number of simulations by setting the first argument of rbinom. The resulting variable, flips, then contains the results of these one hundred thousand draws. We can't print all of these results, at least not in a way we could make sense of. Instead, we can visualize them in a graph. This plot is called a histogram: each bar shows how often one outcome occurred, from 0, 1, 2 all the way through 10. Histograms are a common way to examine a probability distribution, and we'll be using them frequently throughout the course. Notice that out of these hundred thousand draws, about twenty-five thousand are equal to 5.
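As a rough sketch of this simulation step: the seed and the base-R hist() call with one bin per outcome are assumptions made here for illustration, and the plot in the video may have been drawn differently.

```r
# Fix the random seed so the draws are reproducible (assumption: any seed works)
set.seed(1)

# 100,000 draws from a binomial: 10 coin flips, each with probability 0.5 of heads
flips <- rbinom(100000, 10, 0.5)

# One bar per outcome 0..10; the half-integer breaks center each bar on an integer
hist(flips, breaks = seq(-0.5, 10.5, by = 1),
     main = "Heads in 10 fair coin flips", xlab = "Number of heads")
```

The tallest bar sits at 5, with the counts falling away symmetrically on either side.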

3. Finding density with simulation

There's a useful trick in R for calculating the fraction equal to 5 directly. The expression flips == 5 compares each item in the vector to 5, giving a vector of TRUE and FALSE values. We can then use the mean() function to find the fraction of comparisons that are TRUE. This works because mean() treats TRUE as 1 and FALSE as 0. Thus, mean(flips == 5) gives the fraction of values equal to 5. You're going to be using this trick with mean() a lot in these exercises, whenever you estimate values through simulation. In this case, we found that the fraction of outcomes equal to 5 was 0.2463: that is, there's about a 24.6% chance. This is called the density of the binomial at that point.
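A minimal sketch of the mean() trick, regenerating the draws so it runs on its own. The seed and the variable name frac_5 are assumptions, and the exact fraction you get will vary slightly from the 0.2463 quoted above depending on the random draws.

```r
set.seed(1)  # assumption: seed chosen only for reproducibility
flips <- rbinom(100000, 10, 0.5)

# flips == 5 is a logical vector; mean() counts TRUE as 1 and FALSE as 0,
# so this is the fraction of simulated outcomes with exactly 5 heads
frac_5 <- mean(flips == 5)
frac_5  # expect a value close to 0.246
```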

4. Calculating exact probability density

Simulation is a very useful way to understand a distribution and to answer questions about its behavior. But in the case of the binomial distribution, R also provides a way to calculate the exact probability density, using the dbinom function. dbinom takes three arguments: the outcome at which we're evaluating the density, 5; the number of coins, 10; and the probability of each being heads, 0.5. Notice that this gives a result of 0.246: this confirms the result from our simulation, that the probability is about 24.6%. Similarly, we could calculate the probability density of getting exactly six heads, or of all ten coins being heads, by changing the first argument. Finding a probability through both simulation and exact calculation will be a common task in this course.
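The exact calculation can be sketched as follows. The variable names are chosen here for illustration, and the values in the comments are rounded.

```r
# Exact probability of exactly 5 heads in 10 fair flips: 252/1024
p_five <- dbinom(5, 10, 0.5)   # about 0.246

# Changing the first argument gives the density at other outcomes
p_six <- dbinom(6, 10, 0.5)    # 210/1024, about 0.205
p_ten <- dbinom(10, 10, 0.5)   # 1/1024, about 0.001

c(p_five, p_six, p_ten)
```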

5. Cumulative density

So now you know not to take my bet: more likely than not, I won't get exactly 5 heads out of 10. What if I offer a new one? I'll pay you a dollar if 4 or fewer come up heads, otherwise you have to pay me. This describes the cumulative density of the binomial, the probability that X is less than or equal to 4, and the process for calculating it is similar to calculating the density. You can estimate it using simulation: we'd generate a hundred thousand draws from the binomial distribution, then instead of using ==, we would compute mean(flips <= 4). We can see that this was true of about 37.7% of the simulations. Much like the density, R provides a function to get the exact cumulative density of the binomial. Rather than dbinom, use pbinom. Its result confirms that the probability is about 37.7% that a binomial with ten flips gets 4 or fewer heads. In other words, you still shouldn't take my bet, since you would win less than half the time. In your exercises, you'll find the density and the cumulative density of several other binomial distributions, using both the simulation approach and the dbinom and pbinom functions.
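Both routes to the cumulative density can be sketched together. The seed and variable names are assumptions, and the final line, which sums dbinom over the outcomes 0 through 4, is an extra cross-check rather than something shown in the video.

```r
set.seed(1)  # assumption: seed chosen only for reproducibility
flips <- rbinom(100000, 10, 0.5)

# Simulation estimate of Pr(X <= 4): fraction of draws with 4 or fewer heads
sim_cdf <- mean(flips <= 4)    # expect a value close to 0.377

# Exact cumulative density: 386/1024
exact_cdf <- pbinom(4, 10, 0.5)

# Cross-check: the cumulative density is the sum of the densities at 0, 1, 2, 3, 4
sum_of_densities <- sum(dbinom(0:4, 10, 0.5))

c(sim_cdf, exact_cdf, sum_of_densities)
```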

6. Let's practice!