Inference for a mean

1. Inference for a mean

Great work so far! In this video, we will review Student's t-test. The t-test helps us make inferences about a population based on a sample.

2. Inference for a mean

A lot of companies reason from samples, so interviewers might be interested in checking your understanding of statistical inference. For example, pharmaceutical companies use statistical inference to assess the impact of a drug on all patients based on a limited number of observations.

3. Inference for a mean

In this lesson, we will make inferences about a population's mean based on a sample using a t-test. More specifically, we will calculate a confidence interval and test if the population's mean equals a given value.

4. Assumptions

The t-test assumes that the underlying data are normally distributed. Recall from the central limit theorem that the sampling distribution of the sample mean converges to a normal distribution as the sample size grows, even when the individual observations are non-normal, so the test remains approximately valid for large samples. Additionally, the t-test requires the sample to be random and the observations to be independent.
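To make the central limit theorem concrete, here is a small simulation sketch. The exponential distribution, the sample size of 50, the number of replications, and the seed are all illustrative assumptions, not values from the lesson.

```r
set.seed(2024)  # arbitrary seed, only for reproducibility

# Draw 5000 samples of size 50 from a clearly non-normal (exponential)
# distribution and keep the mean of each sample.
sample_means <- replicate(5000, mean(rexp(50, rate = 1)))

# Theory: the sample means should be centered near 1 (the population mean)
# with a standard deviation near 1 / sqrt(50).
mean(sample_means)
sd(sample_means)
1 / sqrt(50)
```

Plotting `hist(sample_means)` should show a roughly bell-shaped curve even though the individual observations are strongly skewed.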

5. Confidence interval

Let's review confidence intervals. Imagine that we draw a sample from the following population.

6. Confidence interval

Knowing that the underlying data follow a normal distribution, we can calculate the confidence interval: a range around the sample mean, built by a procedure that captures the population's mean in a given proportion of repeated samples. That proportion is the confidence level.
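As a rough sketch of the calculation, we can build a 95% confidence interval by hand from the sample mean, the standard error, and a critical value of the t distribution. The simulated sample and the variable names below are assumptions made for illustration.

```r
set.seed(42)                        # arbitrary seed, only for reproducibility
x <- rnorm(30, mean = 5, sd = 2)    # hypothetical sample of 30 observations

n      <- length(x)
se     <- sd(x) / sqrt(n)           # standard error of the mean
t_crit <- qt(0.975, df = n - 1)     # two-sided 95% critical value

# Interval: sample mean plus or minus critical value times standard error
ci_manual <- mean(x) + c(-1, 1) * t_crit * se
ci_manual

# The same interval as computed by t.test()
ci_ttest <- t.test(x)$conf.int
ci_ttest
```

The two intervals should agree, since `t.test` uses the same formula for the one-sample case.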

7. Confidence interval

As the sample size increases, the confidence interval narrows,

8. Confidence interval

because we can estimate the population's mean with higher precision.
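The effect of sample size can be seen directly in the interval's half-width. The helper `half_width` below is a hypothetical function written for this illustration, and it assumes a fixed sample standard deviation of 2 so that only the sample size changes.

```r
# Half-width of a two-sided t-based confidence interval for the mean,
# given sample size n and sample standard deviation s (assumed fixed here).
half_width <- function(n, s = 2, conf_level = 0.95) {
  alpha <- 1 - conf_level
  qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
}

sizes  <- c(10, 100, 1000)
widths <- sapply(sizes, half_width)
data.frame(n = sizes, half_width = widths)
```

Both the critical value and the `s / sqrt(n)` term shrink as `n` grows, so the half-width decreases with each larger sample.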

9. 95% confidence interval

Make sure that you can precisely define a confidence interval during the interview. Let's take a 95% confidence interval, for example. Suppose we draw 100 different samples from the same population.

10. 95% confidence interval

For each of them, we compute a 95% confidence interval.

11. 95% confidence interval

We check if the confidence interval contains the true mean.

12. 95% confidence interval

Approximately 95 of the 100 confidence intervals will contain the true mean value. In practice, however, we select one random sample and generate one confidence interval, which may or may not contain the true mean.
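We can mimic this thought experiment with a short simulation. The population parameters, the sample size of 25, and the seed are assumptions chosen for illustration; in real data the true mean is unknown.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
true_mean <- 10    # known here only because we simulate the population

# Draw 100 samples, compute a 95% confidence interval for each one,
# and record whether the interval contains the true mean.
covered <- replicate(100, {
  x  <- rnorm(25, mean = true_mean, sd = 3)
  ci <- t.test(x)$conf.int
  ci[1] <= true_mean && true_mean <= ci[2]
})

sum(covered)       # number of intervals that contain the true mean
```

The count should land close to 95, though it varies from run to run because each set of 100 samples is itself random.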

13. One-sample t-test

We can also use a sample to test whether the population's mean equals a given value. The null hypothesis of the one-sample t-test states that the population's mean equals that value. The alternative hypothesis is that the two values differ.
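To see what the test computes, here is a sketch of the one-sample t statistic and its two-sided p-value calculated by hand. The simulated sample and the hypothesized value `mu0` are illustrative assumptions.

```r
set.seed(7)                          # arbitrary seed, only for reproducibility
x   <- rnorm(20, mean = 0.5, sd = 1) # hypothetical sample
mu0 <- 0                             # hypothesized population mean

n <- length(x)

# t statistic: distance between sample mean and mu0 in standard-error units
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the t distribution with n - 1 degrees of freedom
p_val <- 2 * pt(-abs(t_stat), df = n - 1)

# Built-in test for comparison
res <- t.test(x, mu = mu0)
c(manual = t_stat, builtin = unname(res$statistic))
c(manual = p_val,  builtin = res$p.value)
```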

14. t-test in R

To perform a t-test in R, you can apply the t.test function to sample data. The output of the test contains several pieces of information.

15. t-test in R

By default, the function tests whether the population's mean equals zero. The output of the test contains the p-value. It's crucial to know how to interpret a p-value during the interview. Recall that the p-value is the probability of seeing sample data at least this extreme, assuming that the null hypothesis is true.
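A minimal sketch of the default call follows. The data are simulated here as an assumption, since the lesson's dataset is not shown in the transcript.

```r
set.seed(123)                      # arbitrary seed, only for reproducibility
x <- rnorm(40, mean = 1, sd = 2)   # hypothetical sample

res <- t.test(x)   # defaults: mu = 0, two-sided alternative

res$null.value     # hypothesized mean used by the test
res$alternative    # direction of the alternative hypothesis
res$p.value        # probability of data at least this extreme under the null
```

Printing `res` shows the full report, while the `$` components let us pull out individual pieces for further use.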

16. t-test in R

The function also prints out a 95% confidence interval.

17. t-test in R

You can change the hypothesized mean by setting the mu parameter. Note that the alternative hypothesis here is that the true mean is not equal to 2 rather than to 0.
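As a sketch, setting `mu` looks like this. The sample is simulated as an assumption; only the `mu = 2` value comes from the lesson.

```r
set.seed(123)                      # arbitrary seed, only for reproducibility
x <- rnorm(40, mean = 1, sd = 2)   # hypothetical sample

res_default <- t.test(x)           # null hypothesis: mean equals 0
res_mu2     <- t.test(x, mu = 2)   # null hypothesis: mean equals 2

res_mu2$null.value
c(default = res_default$p.value, mu2 = res_mu2$p.value)
```

Changing `mu` shifts the test statistic and the p-value, but the confidence interval stays the same because it depends only on the sample.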

18. t-test in R

To change the confidence level of the interval, set the conf.level parameter to the chosen value, expressed as a proportion. Here, we set it to 90%, that is, 0.90.
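A short sketch of the `conf.level` argument, again on simulated data used as an assumption:

```r
set.seed(123)                      # arbitrary seed, only for reproducibility
x <- rnorm(40, mean = 1, sd = 2)   # hypothetical sample

res_95 <- t.test(x)                     # default 95% interval
res_90 <- t.test(x, conf.level = 0.90)  # 90% interval

res_95$conf.int
res_90$conf.int
attr(res_90$conf.int, "conf.level")     # level stored with the interval
```

The 90% interval is narrower than the 95% one, since a lower confidence level requires a smaller critical value.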

19. Summary

Let's summarize. We've covered the assumptions of a t-test, confidence intervals, the one-sample t-test, and the t.test function in R.

20. Let's practice!

Let's practice using t-tests in R!