
Split the data

In this exercise, you will split your data into training and test sets using the caret package. In the next set of lessons, you will use the training set to build logistic regression models and use the test set to validate these models.

This exercise is part of the course

HR Analytics: Predicting Employee Churn in R


Exercise instructions

  • Load the caret package.
  • Set a seed of 567 and create a data partition that splits the dataset emp_final into a 70% training section and a 30% test section.
  • Create the training dataset by selecting the row numbers stored in index_train from the dataset emp_final.
  • Assign the remaining observations from emp_final to the testing set.
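
As a rough illustration of the steps above, here is a minimal sketch on a made-up data frame. The names toy_emp, emp_id, index_toy, train_toy and test_toy are hypothetical and are not part of the course data; only createDataPartition() from caret and the turnover column name come from the exercise. With list = FALSE, createDataPartition() returns a one-column matrix of row numbers, sampled within each level of the outcome so that the class balance is roughly preserved.

```r
# Illustrative sketch only: toy_emp is a made-up stand-in for emp_final
library(caret)

# Fix the random number generator so the split is reproducible
set.seed(567)

# Hypothetical data: 100 employees, 20 of whom left (turnover = 1)
toy_emp <- data.frame(
  emp_id   = 1:100,
  turnover = factor(rep(c(0, 1), times = c(80, 20)))
)

# Row numbers for roughly 70% of the data, stratified on the outcome
index_toy <- createDataPartition(toy_emp$turnover, p = 0.7, list = FALSE)

# Rows listed in index_toy go to training
train_toy <- toy_emp[index_toy, ]

# A negative index drops those rows, leaving the remainder for testing
test_toy <- toy_emp[-index_toy, ]

# Check the sizes and the class balance of each part
nrow(train_toy)
nrow(test_toy)
prop.table(table(train_toy$turnover))
prop.table(table(test_toy$turnover))
```

Because the negative index removes exactly the rows selected for training, no observation can end up in both sets.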

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load caret
___

# Set seed of 567
___

# Store row numbers for training dataset: index_train
index_train <- ___(emp_final$turnover, p = ___, list = FALSE)

# Create training dataset: train_set
train_set <- emp_final[___, ]

# Create testing dataset: test_set
test_set <- emp_final[___, ]