Exercise

Try a 60/40 split

As you saw in the video, you'll be working with the Sonar dataset in this chapter, using a 60% training set and a 40% test set. We'll practice making a train/test split one more time, just to be sure you have the hang of it. Recall that calling sample() on the number of rows returns a random permutation of the row indices, which you can use when making train/test splits, e.g.:

n_obs <- nrow(my_data)
permuted_rows <- sample(n_obs)

And then use those row indices to randomly reorder the dataset, e.g.:

my_data <- my_data[permuted_rows, ]

Once your dataset is randomly ordered, you can split off the first 60% as a training set and the last 40% as a test set.
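
The shuffle-then-split idea can be sketched end to end on a small stand-in dataset. This is only an illustration: the 10-row data frame, its id and x columns, and the seed are assumptions made up for the example, not part of the exercise.

```r
# Hypothetical stand-in data: 10 rows, so a 60/40 split gives 6 and 4 rows.
set.seed(42)  # assumed seed, used only to make the shuffle reproducible
my_data <- data.frame(id = 1:10, x = rnorm(10))

# Shuffle the rows using a random permutation of the row indices
n_obs <- nrow(my_data)
permuted_rows <- sample(n_obs)
my_data <- my_data[permuted_rows, ]

# Row index where the training set ends (60% of the rows, rounded)
split <- round(n_obs * 0.60)

# First 60% of the shuffled rows for training, the remainder for testing
train <- my_data[1:split, ]
test <- my_data[(split + 1):n_obs, ]

nrow(train)  # 6
nrow(test)   # 4
```

Because the two index ranges 1:split and (split + 1):n_obs do not overlap, every row ends up in exactly one of the two sets.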

Instructions
  • Get the number of observations (rows) in Sonar, assigning to n_obs.
  • Shuffle the row indices of Sonar and store the result in permuted_rows.
  • Use permuted_rows to randomly reorder the rows of Sonar, saving as Sonar_shuffled.
  • Identify the proper row to split on for a 60/40 split. Store this row number as split.
  • Save the first 60% of Sonar_shuffled as a training set.
  • Save the last 40% of Sonar_shuffled as the test set.
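
One possible way to work through the steps above is sketched below. The helper name split_60_40 is hypothetical and exists only to bundle the steps; the sketch also assumes that Sonar is loaded from the mlbench package, which the exercise environment may already provide for you.

```r
# Hypothetical helper that bundles the steps: shuffle the rows, find the
# split row, and return the training and test sets in a list.
split_60_40 <- function(df) {
  n_obs <- nrow(df)                    # number of observations
  permuted_rows <- sample(n_obs)       # shuffled row indices
  shuffled <- df[permuted_rows, ]      # randomly reordered rows
  split <- round(n_obs * 0.60)         # last row of the training set
  list(
    train = shuffled[1:split, ],            # first 60%
    test = shuffled[(split + 1):n_obs, ]    # last 40%
  )
}

# Assumption: Sonar is available from mlbench; skipped if it is not installed.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data(Sonar, package = "mlbench")
  parts <- split_60_40(Sonar)
  train <- parts$train
  test <- parts$test
  nrow(train) / nrow(Sonar)  # should be close to 0.60
}
```

In the exercise itself you would write the same steps directly, with the variable names the instructions ask for (n_obs, permuted_rows, Sonar_shuffled, split, train, test) rather than a helper function.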