Comparing the halves of your dataset

Just as you inspected the features of your full dataset, it's important to examine the halves after you've split the data. You can always call describe() on each half separately, but the psych package also provides functions that compare a dataset across a grouping variable.

In this exercise, you'll use the indices created when splitting the dataset to create a grouping variable and attach it to the gcbs dataset. Once that grouping variable is set up, you can use describeBy() and statsBy() to view basic descriptive statistics as well as between-group statistics.

A word of warning: while the group argument of describeBy() has to be a vector, the group argument of statsBy() has to be the name of a column in your dataframe. Plan accordingly!
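To make that difference concrete, here is a minimal sketch on made-up data. The toy data frame and the index names idx_first and idx_second are hypothetical stand-ins, not the gcbs data or the indices from the previous exercise. The psych calls are guarded so the snippet still runs if the package is not installed.

```r
# Toy stand-in for the real data: 100 rows, three hypothetical items
set.seed(42)
toy <- data.frame(Q1 = sample(1:5, 100, replace = TRUE),
                  Q2 = sample(1:5, 100, replace = TRUE),
                  Q3 = sample(1:5, 100, replace = TRUE))

# Hypothetical split indices (assumed names, chosen for illustration only)
idx_first  <- 1:50
idx_second <- 51:100

# Build the grouping vector: 1 for the first half, 2 for the second
group_var <- vector("numeric", nrow(toy))
group_var[idx_first]  <- 1
group_var[idx_second] <- 2

# Attach it as a column so it can also be referred to by name
toy_grouped <- cbind(toy, group_var)

if (requireNamespace("psych", quietly = TRUE)) {
  # describeBy(): the group is passed as a vector
  print(psych::describeBy(toy_grouped, group = group_var))

  # statsBy(): the group is passed as the name of a column
  print(psych::statsBy(toy_grouped, group = "group_var"))
}
```

Keeping both the standalone vector and the attached column around means either function can be called without reshaping the data again.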

This exercise is part of the course

Factor Analysis in R

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Use the indices from the previous exercise to create a grouping variable
group_var <- vector("numeric", nrow(gcbs))
group_var[___] <- 1
group_var[___] <- 2