Try a 60/40 split

As you saw in the video, you'll be working with the Sonar dataset in this chapter, using a 60% training set and a 40% test set. We'll practice making a train/test split one more time, just to be sure you have the hang of it. Recall that calling sample() on the number of rows returns a random permutation of the row indices 1 through n_obs, which you can use when making train/test splits, e.g.:

n_obs <- nrow(my_data)
permuted_rows <- sample(n_obs)

You can then use those row indices to randomly reorder the dataset, e.g.:

my_data <- my_data[permuted_rows, ]

Once your dataset is randomly ordered, you can split off the first 60% as a training set and the last 40% as a test set.
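
The split step itself can be sketched as follows. This is a minimal illustration rather than the course's reference solution: the built-in mtcars data frame stands in for my_data, the seed is an arbitrary choice for reproducibility, and the names split, train and test are assumptions made for the example.

```r
# Stand-in data: mtcars (32 rows) plays the role of my_data.
set.seed(42)  # assumption: arbitrary seed so the shuffle is repeatable
my_data <- mtcars
n_obs <- nrow(my_data)

# Shuffle the rows, as described above.
permuted_rows <- sample(n_obs)
my_data <- my_data[permuted_rows, ]

# 60% of 32 is 19.2, so round to get a whole row number to split on.
split <- round(n_obs * 0.60)

# First 60% of the shuffled rows for training, the remainder for testing.
train <- my_data[1:split, ]
test  <- my_data[(split + 1):n_obs, ]

nrow(train)  # 19
nrow(test)   # 13
```

Note the parentheses in (split + 1):n_obs. Because : binds more tightly than +, writing split + 1:n_obs would instead add split to every element of 1:n_obs and index past the end of the data.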

Get the number of observations (rows) in Sonar, assigning to n_obs.
Shuffle the row indices of Sonar and store the result in permuted_rows.
Use permuted_rows to randomly reorder the rows of Sonar, saving as Sonar_shuffled.
Identify the proper row to split on for a 60/40 split. Store this row number as split.
Save the first 60% of Sonar_shuffled as a training set.
Save the last 40% of Sonar_shuffled as the test set.
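
Applied to Sonar, the six steps above might look like the sketch below. It assumes Sonar comes from the mlbench package (208 rows); if that package is not installed, a randomly generated stand-in with the same number of rows is used so the code still runs. The names train and test, the seed, and the stand-in data are assumptions for illustration, not part of the exercise.

```r
# Load Sonar from mlbench if available; otherwise build a hypothetical stand-in.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data(Sonar, package = "mlbench")
} else {
  # Stand-in only: 208 rows of random numbers plus a two-level Class factor.
  Sonar <- data.frame(
    matrix(runif(208 * 60), nrow = 208),
    Class = factor(sample(c("M", "R"), 208, replace = TRUE))
  )
}

set.seed(42)  # assumption: arbitrary seed for a repeatable shuffle

# Get the number of observations.
n_obs <- nrow(Sonar)

# Shuffle the row indices.
permuted_rows <- sample(n_obs)

# Randomly reorder the rows.
Sonar_shuffled <- Sonar[permuted_rows, ]

# Row to split on for a 60/40 split.
split <- round(n_obs * 0.60)

# Training set: first 60% of the shuffled rows.
train <- Sonar_shuffled[1:split, ]

# Test set: last 40% of the shuffled rows.
test <- Sonar_shuffled[(split + 1):n_obs, ]
```

A quick check such as nrow(train) + nrow(test) == n_obs confirms that every row landed in exactly one of the two sets.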
