
Excursion: Correlation

If you're familiar with statistics, you'll have heard about Pearson's correlation. It measures the strength of the linear relationship between two variables, say \(X\) and \(Y\). It ranges from -1 to 1. A value close to 1 indicates a strong positive association: if \(X\) is high, \(Y\) tends to be high as well. A value close to -1 indicates a strong negative association: if \(X\) is high, \(Y\) tends to be low. When the Pearson correlation between two variables is 0, there is no linear association between \(X\) and \(Y\). The variables may be independent, but a correlation of 0 does not prove it, because they can still be related in a nonlinear way.
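For reference, the sample version of the coefficient can be written as follows. The symbols \(x_i\), \(y_i\), \(\bar{x}\), \(\bar{y}\) and \(n\) are notation introduced here for illustration: the paired observations, their means, and the number of pairs.

\[
r_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]

The numerator is positive when \(X\) and \(Y\) tend to deviate from their means in the same direction and negative when they deviate in opposite directions. The denominator rescales the result so that it stays between -1 and 1.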

You can calculate the correlation between two vectors with the cor() function. For example, this code computes the correlation between the columns height and width of a fictional data frame size:

cor(size$height, size$width)
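
To try this call without any data at hand, here is a minimal sketch that builds a small size data frame first. The values are made up purely for illustration. It also recomputes the coefficient from the covariance and standard deviations to show what cor() returns by default.

```r
# Hypothetical data: the numbers below are invented for illustration only.
size <- data.frame(
  height = c(150, 160, 170, 180, 190),
  width  = c(40, 44, 47, 52, 55)
)

# cor() uses method = "pearson" by default.
r <- cor(size$height, size$width)

# The same quantity from its definition:
# covariance divided by the product of the standard deviations.
r_manual <- cov(size$height, size$width) /
  (sd(size$height) * sd(size$width))

print(r)
print(all.equal(r, r_manual))
```

Because width grows almost in step with height in this toy data, the coefficient comes out close to 1.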

The data you've worked with in the previous exercise, international.sav, is again available in your working directory. It's now up to you to import it and carry out the correct calculations to answer the following question:

What is the correlation coefficient for the two numerical variables gdp and f_illit (female illiteracy rate)?
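
One possible way to approach this is sketched below. It assumes the haven package is installed and uses its read_sav() function to read the SPSS file; read.spss() from the foreign package would be an alternative. The variable name international and the guard conditions are choices made for this sketch, not part of the exercise.

```r
# Sketch only: assumes international.sav is in the working directory
# and that the haven package is available.
path <- "international.sav"

if (requireNamespace("haven", quietly = TRUE) && file.exists(path)) {
  # Import the SPSS file as a data frame
  international <- haven::read_sav(path)

  # Pearson correlation between GDP and female illiteracy rate
  print(cor(international$gdp, international$f_illit))
} else {
  message("haven is not installed or ", path, " was not found")
}
```

If either column contained missing values, cor() would return NA; passing use = "complete.obs" restricts the calculation to rows where both values are present.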

This exercise is part of the course

Intermediate Importing Data in R
