Create a cross validation plan

There are several ways to implement an n-fold cross validation plan. In this exercise, you will create such a plan using vtreat::kWayCrossValidation(), and examine it.

kWayCrossValidation() creates a cross validation plan with the following call:

splitPlan <- kWayCrossValidation(nRows, nSplits, dframe, y)

where nRows is the number of rows of data to be split, and nSplits is the desired number of cross-validation folds.

Strictly speaking, dframe and y aren't used by kWayCrossValidation; they are there for compatibility with other vtreat data partitioning functions. You can set them both to NULL.
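As a minimal sketch of the call described above, the snippet below builds a plan for a hypothetical 10-row dataset. The row count and the name examplePlan are assumptions chosen for illustration; they are not part of the exercise.

```r
library(vtreat)

# Hypothetical example: 10 rows split into 3 folds.
# dframe and y are not used by kWayCrossValidation, so both are NULL.
examplePlan <- kWayCrossValidation(10, 3, NULL, NULL)

# The plan has one element per fold
length(examplePlan)

# Collect the held-out (app) indices from every fold
allApp <- sort(unlist(lapply(examplePlan, function(fold) fold$app)))

# Check that the app sets together cover each row exactly once
identical(as.integer(allApp), 1:10)
```

Because the assignment of rows to folds is random, the indices in each fold will generally differ from run to run unless a seed is set with set.seed() beforehand.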

The resulting splitPlan is a list of nSplits elements; each element contains two vectors:

  • train: the row indices (from 1 to nRows) of the data that will form the training set
  • app: the row indices (from 1 to nRows) of the data that will form the test (or application) set
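To illustrate how the train and app vectors might be used, here is a hedged sketch that loops over the folds of a plan, fits a model on each training set, and predicts on the corresponding held-out rows. The built-in mtcars data, the formula mpg ~ wt, and the names carsPlan and predCV are assumptions made for this example only; they are not the dataset or model of this exercise.

```r
library(vtreat)

# Assumption: the built-in mtcars data stands in for the exercise data
nRowsCars <- nrow(mtcars)
carsPlan <- kWayCrossValidation(nRowsCars, 3, NULL, NULL)

# Placeholder for the out-of-sample predictions, one per row
predCV <- numeric(nRowsCars)

for (fold in carsPlan) {
  # Fit the model on this fold's training rows only
  model <- lm(mpg ~ wt, data = mtcars[fold$train, ])
  # Predict on the held-out (application) rows of this fold
  predCV[fold$app] <- predict(model, newdata = mtcars[fold$app, ])
}

# Cross-validated root mean squared error
rmseCV <- sqrt(mean((mtcars$mpg - predCV)^2))
rmseCV
```

Each row receives its prediction from a model that never saw that row during fitting, which is what makes the resulting error estimate out-of-sample.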

In this exercise, you will create a 3-fold cross-validation plan for the dataset mpg.

This exercise is part of the course

Supervised Learning in R: Regression

Exercise instructions

  • Load the package vtreat.
  • Get the number of rows in mpg and assign it to the variable nRows.
  • Call kWayCrossValidation to create a 3-fold cross validation plan and assign it to the variable splitPlan.
    • You can set the last two arguments of the function to NULL.
  • Call str() to examine the structure of splitPlan.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load the package vtreat
___

# mpg is available
summary(mpg)

# Get the number of rows in mpg
nRows <- ___

# Implement the 3-fold cross-validation plan with vtreat
splitPlan <- ___

# Examine the split plan
___