
Welcome to the course!

1. Welcome to the course!

Hi and welcome to Foundations of Inference. I'm Jo Hardin, I'm a professor of math and statistics at Pomona College, and I'll be your instructor for this course. I'm assuming that you've already worked through the first few courses in this intro stats series. In this course, you will be building on your previous work to now make inferential, instead of descriptive, claims based on the data at hand.

2. What is statistical inference?

Statistical inference is the process of making claims about a population based on information from a sample of data. Typically, the data represent only a small portion of the larger group which you'd like to summarize.

3. What is statistical inference?

For example,

4. What is statistical inference?

you might be interested in how a drug treats diabetes.

5. What is statistical inference?

Your interest is in how the drug treats all people with diabetes,

6. What is statistical inference?

not just the few dozen people in your study. At first glance, the logic of statistical inference seems to be backwards, but as you become more familiar with the steps in the process, the logic will make much more sense.

7. Assume two populations prefer cola at same rate

Consider a situation where you are trying to convince your marketing director that people on the East Coast prefer cola to orange soda at a higher rate than people on the West Coast. To make the argument, the first step is to assume that the two populations, East Coast people and West Coast people, prefer cola to orange soda at the same rate. Here, about 60% of all people on each coast prefer cola and 40% prefer orange soda.

8. The sample data

The second step in the process is to investigate the sample data and attempt to argue that the data at hand are nothing like what would be collected had the populations really been identical with respect to soda preference. Here, the cola preference rate in each sample matches the rate in the population, so these data are consistent with the equal populations assumption.

9. The sample data (take 2)

Here, however, the sample from the East Coast prefers cola at a rate which is twice as high as that from the West Coast. With large samples, if the data are extremely different from what the equal populations model would produce, we can conclude that the equal populations assumption is not valid.
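To make this logic concrete, here is a minimal simulation sketch in Python. It repeatedly draws two samples from populations that share the same 60% cola rate and records the difference in sample proportions, then checks how often that difference is as large as a hypothetical observed one. The sample size (100 per coast), the observed rates (80% and 40%), the number of repetitions, and the function name are all assumptions made for illustration; they are not from the slides.

```python
import random


def simulate_null_differences(n_per_coast, cola_rate, n_reps, seed=0):
    """Simulate East-minus-West differences in sample cola proportions
    when both coasts share the same true cola rate."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_reps):
        # Each person independently prefers cola with probability cola_rate.
        east = sum(rng.random() < cola_rate for _ in range(n_per_coast))
        west = sum(rng.random() < cola_rate for _ in range(n_per_coast))
        diffs.append(east / n_per_coast - west / n_per_coast)
    return diffs


# The shared 60% rate comes from the example; the other numbers are assumed.
null_diffs = simulate_null_differences(n_per_coast=100, cola_rate=0.6, n_reps=2000)

# Hypothetical observed samples: East 80% cola, West 40% cola (twice as high).
observed_diff = 0.80 - 0.40

# Share of simulated differences at least as large as the observed one.
share_as_extreme = sum(d >= observed_diff for d in null_diffs) / len(null_diffs)
print(share_as_extreme)
```

If only a tiny share of the simulated differences is as large as the observed one, the observed data look very unlike data from equal populations, which is the argument described above.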

10. Vocabulary

At this point, it is important for you to know some new vocabulary to describe the previous setting. The claim that is not interesting is called the null hypothesis and is denoted "H-naught". In the soda example, the null hypothesis is that soda preference is the same on the two coasts. The claim that corresponds to the research hypothesis is called the alternative hypothesis and is denoted by "H-A". In the soda example, the alternative hypothesis is that people on the East Coast prefer cola at a higher rate than those on the West Coast. Almost always, the goal is to disprove the null hypothesis and claim that the alternative hypothesis is true.
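In symbols, the two hypotheses for the soda example might be written as follows. The names p_East and p_West are assumed notation (not from the slides) for the true proportions of people on each coast who prefer cola.

```latex
H_0 : p_{\text{East}} = p_{\text{West}}
\qquad
H_A : p_{\text{East}} > p_{\text{West}}
```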

11. Example: cheetah speed

Suppose you're conducting research to compare the average running speed of two different subspecies of cheetahs. The null hypothesis is that Asian and African cheetahs run at the same speed, on average. The alternative hypothesis is that African cheetahs are faster than Asian cheetahs, on average.

12. Example: election

Or consider a dataset collected to measure whether, in an election, candidate X will win the popular vote. The number of interest is the true proportion of votes that candidate X will receive. That number is a population measure. The null hypothesis is that candidate X will get half the votes in the population. The alternative hypothesis is that candidate X will get more than half the votes in the population, that is, will win the election.
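As a rough sketch of how the election hypotheses could be checked against data, the Python snippet below computes the chance of seeing at least a given number of supporters in a poll if candidate X truly had half the votes. The poll size (1000 voters), the count of supporters (530), and the function name are hypothetical values chosen for illustration. The calculation also assumes each polled voter is an independent draw from the population.

```python
from math import comb


def upper_tail_under_null(n_voters, n_for_x):
    """Probability that at least n_for_x of n_voters support candidate X
    when the true proportion of support is exactly one half."""
    # Under the null, every one of the 2**n_voters outcomes is equally likely.
    favorable = sum(comb(n_voters, k) for k in range(n_for_x, n_voters + 1))
    return favorable / 2 ** n_voters


# Hypothetical poll: 530 of 1000 sampled voters support candidate X.
tail_probability = upper_tail_under_null(n_voters=1000, n_for_x=530)
print(tail_probability)
```

A small tail probability means a poll result this favorable to candidate X would be unusual if the null hypothesis were true, which counts as evidence for the alternative hypothesis.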

13. Let's practice!

OK, now it's your turn to practice what you've learned.