
Comparing the halves of your dataset

Just as you inspected the features of your full dataset, it's important to examine the halves after you've split the data. You can always use describe() on each half separately, but the psych package also provides functions that summarize a dataset by a grouping variable, which makes the halves easier to compare.

In this exercise, you'll use the indices created when splitting the dataset to create a grouping variable and attach it to the gcbs dataset. Once that grouping variable is set up, you can use describeBy() and statsBy() to view basic descriptive statistics as well as between-group statistics.
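As a rough illustration of the grouping step described above, here is a minimal base-R sketch. It does not use the real gcbs data or the indices from the previous exercise: `toy_data`, `indices_a`, `indices_b`, and `toy_grouped` are hypothetical stand-ins invented for this example.

```r
# Toy stand-in for the real dataset; names and values are hypothetical.
set.seed(42)
toy_data <- data.frame(Q1 = sample(1:5, 20, replace = TRUE),
                       Q2 = sample(1:5, 20, replace = TRUE))

# Randomly split the row indices into two halves (assumed split scheme).
N <- nrow(toy_data)
indices <- sample(seq_len(N))
indices_a <- indices[1:floor(N / 2)]
indices_b <- indices[(floor(N / 2) + 1):N]

# Build the grouping vector: 1 marks rows in the first half, 2 the second.
group_var <- vector("numeric", N)
group_var[indices_a] <- 1
group_var[indices_b] <- 2

# Attach the grouping vector to the data as a new column.
toy_grouped <- cbind(toy_data, group_var)
```

Because every row index lands in exactly one half, no element of `group_var` is left at its initial value of 0.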

A word of warning: the two functions take their group argument differently. describeBy() expects the grouping variable itself, passed as a vector, while statsBy() expects the name of a column in your data frame. Plan accordingly!
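The difference between the two group arguments can be sketched as follows. This is an illustrative example on made-up data, not the exercise solution: `toy_data` and `toy_grouped` are hypothetical names, and the psych calls run only if the package is installed.

```r
# Hypothetical toy data: three items, 40 respondents.
set.seed(1)
toy_data <- data.frame(Q1 = sample(1:5, 40, replace = TRUE),
                       Q2 = sample(1:5, 40, replace = TRUE),
                       Q3 = sample(1:5, 40, replace = TRUE))

# Grouping vector: first 20 rows in group 1, last 20 in group 2.
group_var <- rep(c(1, 2), each = 20)

# Attach it as a column so it can also be referred to by name.
toy_grouped <- cbind(toy_data, group_var)

if (requireNamespace("psych", quietly = TRUE)) {
  # describeBy(): the grouping variable is passed as a vector.
  print(psych::describeBy(toy_grouped, group = group_var))

  # statsBy(): the grouping variable is passed as a column name.
  print(psych::statsBy(toy_grouped, group = "group_var"))
}
```

Keeping both the standalone vector and the attached column around means either function can be called without reshaping the data first.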

This exercise is part of the course

Factor Analysis in R

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Use the indices from the previous exercise to create a grouping variable
group_var <- vector("numeric", nrow(gcbs))
group_var[___] <- 1
group_var[___] <- 2