
Two sample t-test

1. Two sample t-test

We will cover the two-sample t-test, which is useful for comparing the means of two different groups.

2. Comparing agreeableness

Let's say we've run a personality survey for a group of people, and we find out that our friend ran the same type of personality survey for a different group of people. Both of us asked our respective groups to rate their agreeableness, or a person's general willingness to agree to something. Now, we want to know whether the two groups gave different responses. We each have an average agreeableness score for our group, but we don't know whether the two averages differ significantly.

3. Define two sample t-test

The two-sample t-test examines whether the means of two independent groups are significantly different from one another. In other words, it helps us judge whether the observed difference could plausibly have occurred by chance.

4. Assumptions for a two sample t-test

For a two-sample t-test to be valid, three assumptions have to be met. One, the two samples are independent. This means no individual has data in both groups A and B. Two, the data in each group are approximately normally distributed. This can be assessed using the Shapiro-Wilk test, available as the shapiro function in scipy-dot-stats. The Shapiro-Wilk test is a test of normality; its null hypothesis is that the sample comes from a normal distribution. If the calculated p-value is greater than point-zero-five, we do not reject that hypothesis and treat the sample data as consistent with a normal distribution. Three, both groups have equal variances. This can be checked with Levene's test, whose null hypothesis is that two or more groups have equal variances. We'll use the levene function from scipy-dot-stats for this. If the returned p-value is greater than point-zero-five, we find no evidence that the variances differ and treat them as equal.

5. Survey results

Here are our two groups: the results from group A that we collected, and the results from group B that our friend collected. Each respondent in the sample rated their agreeableness on a scale from one to seven. One means "not agreeable", while seven means "agreeable".

6. Independent groups

Let's assume that we cross-checked both surveys and found that individuals from group A do not overlap with those from group B. The assumption that both groups are independent is met.
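The cross-check described here can be sketched as a set intersection. This is only a minimal illustration: the respondent IDs below are hypothetical, since the lesson does not show how respondents are identified, and an empty overlap only covers the "no individual in both groups" part of independence.

```python
# Hypothetical respondent IDs; the lesson does not show the real identifiers.
ids_a = {"a01", "a02", "a03", "a04"}
ids_b = {"b01", "b02", "b03", "b04"}

# Any ID present in both sets would violate the independence assumption.
overlap = ids_a & ids_b
groups_are_disjoint = len(overlap) == 0

print(groups_are_disjoint)
```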

7. Normally distributed groups

When we run the Shapiro-Wilk test on both groups' agreeableness columns, both p-values are greater than point-zero-five, at approximately point-one-seven and point-seven-six respectively, so we find no evidence against normality and treat both groups as normally distributed.
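As a rough sketch, the normality check could look like the following. The scores are hypothetical stand-ins, not the survey data from the lesson, so the p-values will not match the ones quoted above.

```python
from scipy import stats

# Hypothetical agreeableness scores (1 to 7); not the survey data from the lesson.
group_a = [4, 5, 3, 4, 6, 2, 5, 4, 3, 5, 4, 6]
group_b = [3, 4, 5, 4, 2, 5, 3, 4, 6, 4, 5, 3]

alpha = 0.05
shapiro_p = {}
for name, scores in (("A", group_a), ("B", group_b)):
    # shapiro returns the test statistic and the p-value.
    result = stats.shapiro(scores)
    shapiro_p[name] = result.pvalue
    # A p-value above alpha means we do not reject normality.
    print(name, round(result.pvalue, 3), result.pvalue > alpha)
```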

8. Equal variances

When we run the levene function on both groups' agreeableness columns, we see that the p-value is greater than point-zero-five, at approximately point-five-two, so there is no significant difference between the two variances.
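A minimal sketch of the equal-variance check, again on hypothetical scores rather than the lesson's data, might look like this:

```python
from scipy import stats

# Hypothetical agreeableness scores (1 to 7); not the survey data from the lesson.
group_a = [4, 5, 3, 4, 6, 2, 5, 4, 3, 5, 4, 6]
group_b = [3, 4, 5, 4, 2, 5, 3, 4, 6, 4, 5, 3]

# levene accepts two or more samples and returns the statistic and p-value.
levene_result = stats.levene(group_a, group_b)
levene_p = levene_result.pvalue

# A p-value above 0.05 gives no evidence that the variances differ.
print(round(levene_p, 3), levene_p > 0.05)
```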

9. Assumptions checked

All three assumptions have been met, so the two-sample t-test for our scenario can be used.

10. Two sample t-test with SciPy

scipy-dot-stats provides the ttest_ind function to conduct the two-sample t-test for two independent samples. It tests the null hypothesis that both samples have identical averages, and by default it assumes that they have equal variances. Its syntax is: stats-dot-ttest_ind, with both groups as arguments.
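To make the call concrete, here is a small sketch that runs ttest_ind and then recomputes the same statistic by hand from the pooled variance. The scores are hypothetical, so the result is not the one reported in the lesson.

```python
import math
from scipy import stats

# Hypothetical agreeableness scores (1 to 7); not the survey data from the lesson.
group_a = [4, 5, 3, 4, 6, 2, 5, 4, 3, 5, 4, 6]
group_b = [3, 4, 5, 4, 2, 5, 3, 4, 6, 4, 5, 3]

# equal_var=True is the default and matches the equal-variance assumption.
result = stats.ttest_ind(group_a, group_b)
t_stat, p_value = result.statistic, result.pvalue

# The same statistic computed by hand with the pooled sample variance.
n_a, n_b = len(group_a), len(group_b)
mean_a = sum(group_a) / n_a
mean_b = sum(group_b) / n_b
var_a = sum((x - mean_a) ** 2 for x in group_a) / (n_a - 1)
var_b = sum((x - mean_b) ** 2 for x in group_b) / (n_b - 1)
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
t_manual = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

print(t_stat, p_value, t_manual)
```

The hand computation is there only to show what the function measures: the difference in means scaled by its estimated standard error.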

11. Two sample t-test with SciPy

Running the test on both groups, we notice that the p-value, which is the second value in the output, is equal to point-four, which is greater than the significance level alpha of point-zero-five. This implies that the average agreeableness of respondents in group A is not significantly different from the average agreeableness of respondents in group B. In other words, we found no evidence that the two groups differ in average agreeableness.
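The decision rule applied here can be written down in a few lines. The helper name decide is hypothetical, introduced only for illustration; the p-value of 0.4 and the alpha of 0.05 are the ones quoted above.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a p-value at significance level alpha."""
    if p_value < alpha:
        return "reject H0: the group means differ significantly"
    return "fail to reject H0: no significant difference detected"

# p-value reported in the lesson for groups A and B.
decision = decide(0.4)
print(decision)
```

Note that failing to reject the null hypothesis is not proof that the groups are identical; it only means the data do not show a significant difference.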

12. Further analysis

Suppose, for example, that a community made up of groups A and B wanted to implement an innovative housing policy. With an average agreeableness score of approximately four in both groups, neither group is especially likely to favor or oppose the new changes.

13. Let's practice!

Let's practice!
