1. Probabilistic logic and statistical inference
Imagine you measured the petal lengths of
2. 50 measurements of petal length
50 flowers of a certain species. Here is the ECDF of those measurements. From what you have just learned, you can compute
3. 50 measurements of petal length
the mean of those 50 measurements, and I'll annotate it on the ECDF with a vertical line.
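As a minimal sketch of that step, here is one way to compute the ECDF and the mean in plain Python. The petal lengths are not the real measurements: they are hypothetical values drawn from an assumed normal distribution (mean 4.7 cm, standard deviation 0.5 cm), and the `ecdf` helper is an illustrative name, not something defined in this lesson.

```python
import random
import statistics


def ecdf(data):
    """Return the sorted values and the fraction of points <= each value."""
    n = len(data)
    x = sorted(data)
    y = [(i + 1) / n for i in range(n)]
    return x, y


# Hypothetical stand-in for the 50 measured petal lengths (cm).
# The mean of 4.7 and spread of 0.5 are assumptions made for illustration.
rng = random.Random(42)
petal_lengths = [rng.gauss(4.7, 0.5) for _ in range(50)]

x, y = ecdf(petal_lengths)
sample_mean = statistics.mean(petal_lengths)

# On a plot, sample_mean would be the position of the vertical line.
print(f"mean petal length of this sample: {sample_mean:.2f} cm")
```

Plotting `y` against `x` as points gives the ECDF, and a vertical line at `sample_mean` marks the mean.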
That is useful, but there are millions of these flowers on the planet. Can you tell me the mean petal length of all of the flowers of that species?
4. 50 measurements of petal length
If I measure another 50 flowers, I get a similar but quantitatively different set of measurements. Can you tell me what value I would get for the mean petal length if I measured yet another 50 flowers?
We just don't have the language to do that without probability. Probabilistic reasoning allows us to describe uncertainty. Though you can't tell me exactly what the mean of the next 50 petal lengths you measure will be, you could say that it is more likely to be close to the mean of the first 50 measurements than to be much greater.
5. 50 measurements of petal length
We can go ahead and repeat the measurements
6. 50 measurements of petal length
over and over again.
7. Repeats of 50 measurements of petal length
We see from the vertical lines that we expect the mean to be somewhere between 4.5 and 5 cm. This is what probabilistic thinking is all about. Given a set of data, you describe probabilistically what you might expect if those data were acquired again and again and again.
This is the heart of statistical inference. It is the process by which we go from measured data to probabilistic conclusions about what we might expect if we collected the same data again. Your data speak in the language of probability.
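To make the idea of repeating the measurements concrete, here is a rough simulation sketch. It assumes, purely for illustration, that petal lengths in the whole population follow a normal distribution with mean 4.7 cm and standard deviation 0.5 cm. The lesson does not state the real distribution, and `measure_flowers` is a hypothetical helper standing in for going out and measuring 50 more flowers.

```python
import random
import statistics

rng = random.Random(0)


def measure_flowers(n=50):
    """Simulate measuring n petal lengths (cm) from the assumed population."""
    # Assumed population: normal with mean 4.7 cm and std 0.5 cm.
    return [rng.gauss(4.7, 0.5) for _ in range(n)]


# Repeat the 50-flower experiment many times, keeping the mean of each repeat.
means = sorted(statistics.mean(measure_flowers()) for _ in range(1000))

# Range that holds roughly the middle 95% of the simulated means.
low, high = means[25], means[974]
print(f"about 95% of the simulated means fall between {low:.2f} and {high:.2f} cm")
```

Under these assumptions, no single repeat predicts the next mean exactly, but the collection of means shows which values are probable and which are not.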
8. Let's practice!
Let's do a few exercises exploring these ideas, and then we'll come back to learn how to start speaking this probabilistic language.