Get startedGet started for free

Normalizing measurements

1. Normalizing measurements

To round out the course, let's discuss the importance of accounting for confounds when comparing images.

2. Analysis workflow

Shown here is a common multi-subject workflow, in which each subject has a raw image that we process, measure, and store in a common group dataset. In this model, each image is treated independently, but you are extracting the same metrics from each subject.

3. OASIS population

For this lesson, let's assume that we have measured the brain volumes for all 400 images in the OASIS dataset and stored them in a pandas DataFrame. Displaying a few random rows shows that we have information such as age, sex, brain volume, and skull volume available.

4. Hypothesis testing

With this data, we want to answer a crucial question: are the brains of men and women different? How would we test this?

5. Hypothesis testing

For this question, we can use a two-sample t-test. We will test the null hypothesis that the mean brain volumes of men and women are the same. The likelihood of the null hypothesis being true is assessed by calculating the t-statistic. If the magnitude of t is very far from 0, then we can reject the null hypothesis that the groups are the same. Fortunately, this function is already implemented in SciPy and is simple to evaluate.

6. Hypothesis testing

To run the t-test, we need the brain volumes for all of the males and females in our sample. Select the male values with df dot loc, then specifying the rows where "sex is male" and the column with "brain volume" values. For females, we change the selected sex to "female." To run the t-test, import the ttest_ind() function from SciPy's stats module. Then, pass the two vectors as our first and second populations. The results object contains the test statistic, which is quite high, and the p-value. The p-value corresponds to the probability that the null hypothesis is true. In this case, we have a large t-statistic and a low p-value, which suggests that there's a significant difference in gender brain size! Should we start writing it up?

7. Correlated measurements

Not so fast. We need to consider the context in which this measurement is acquired. Brains fill up skulls, and skulls are proportional to body size. If we compare brain and skull volumes, we find that they have quite a large correlation. Accounting for this shared variation is important if we want to state that there's something different about the brains of men and women and not just their body size.

8. Normalization

To account for this potential confound, we can normalize brain volume with respect to skull size by calculating the brain to skull ratio. We can then repeat our t-test using the normalized values for males and females. Reviewing the results leads to disappointment: our amazing finding was likely related simply to the fact that large people have larger brains, and there are more large men than large women.

9. Many potential confounds in imaging

Confounds are omnipresent in data science but can be especially pervasive in image analyses. If you are pooling data across multiple hospitals or sites, you must consider how image acquisition, digital storage, clinical protocols, and subject populations might be biased towards a particular result. In short, you may employ extremely sophisticated image analysis techniques, but if your analytical thinking is faulty, then your conclusions will be too!

10. Congratulations!

Congratulations on completing the course! We've covered a lot of material, and it only scratches the surface of the field. You should now have a foundation of knowledge you can use to tackle more advanced concepts, such as image classification, object detection, and convolutional neural networks.

11. Good luck!

Thanks for taking the course, and good luck on your last exercises!