Create a cross validation plan

There are several ways to implement an n-fold cross-validation plan. In this exercise, you will create such a plan using vtreat::kWayCrossValidation() and then examine it.
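To make the idea of a "plan" concrete before turning to vtreat, here is a minimal hand-rolled sketch in base R. It is not how vtreat implements its plans; make_fold_plan is a hypothetical helper name, and the 10 rows, 3 folds, and seed are arbitrary illustration values.

```r
# Hand-rolled sketch of an n-fold plan in base R (not vtreat's implementation).
# make_fold_plan is a hypothetical helper used only for illustration.
make_fold_plan <- function(nRows, nSplits) {
  # Shuffle the row indices, then deal them into nSplits near-equal groups
  shuffled <- sample.int(nRows)
  fold_id <- rep_len(seq_len(nSplits), nRows)
  lapply(seq_len(nSplits), function(k) {
    # Rows held out for testing in fold k
    app <- sort(shuffled[fold_id == k])
    # Every other row is used for training in fold k
    list(train = setdiff(seq_len(nRows), app), app = app)
  })
}

set.seed(1)  # arbitrary seed so the shuffle is reproducible
plan <- make_fold_plan(nRows = 10, nSplits = 3)
str(plan)
```

Each row lands in exactly one held-out group, so every observation is used for testing once and for training in all the other folds.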

kWayCrossValidation() creates a cross validation plan with the following call:

splitPlan <- kWayCrossValidation(nRows, nSplits, dframe, y)

where nRows is the number of rows of data to be split, and nSplits is the desired number of cross-validation folds.

Strictly speaking, dframe and y aren't used by kWayCrossValidation; they are there for compatibility with other vtreat data partitioning functions. You can set them both to NULL.
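As a small illustration of the call described above, the sketch below passes NULL for both dframe and y. The row count of 10, the 3 folds, and the seed are arbitrary values chosen for the example, not taken from the exercise.

```r
library(vtreat)

# 10 rows and 3 folds are arbitrary illustration values
set.seed(2024)  # the split is random, so fix a seed for reproducibility
splitPlan <- kWayCrossValidation(10, 3, NULL, NULL)

length(splitPlan)      # one element per fold
names(splitPlan[[1]])  # the components of the first fold
```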

The resulting splitPlan is a list of nSplits elements; each element contains two vectors:

  • train: the row indices (between 1 and nRows) that will form the training set
  • app: the row indices (between 1 and nRows) that will form the test (or application) set
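The two index vectors are meant to be used for row subsetting. The sketch below shows one way that might look; the data frame toy and the variable names fold, trainSet, and testSet are made up for illustration and are not part of the exercise.

```r
library(vtreat)

# Toy data frame, made up purely for illustration
toy <- data.frame(x = 1:9, y = (1:9)^2)

set.seed(42)  # arbitrary seed for a reproducible split
plan <- kWayCrossValidation(nrow(toy), 3, NULL, NULL)

# Take the first fold and pull out the matching rows
fold <- plan[[1]]
trainSet <- toy[fold$train, ]
testSet  <- toy[fold$app, ]

# Within one fold, the two index vectors share no rows
length(intersect(fold$train, fold$app))

# Together they cover every row of the data
sort(c(fold$train, fold$app))
```

In a typical cross-validation loop, a model would be fit on trainSet and its predictions evaluated on testSet, repeating for each element of the plan.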

In this exercise, you will create a 3-fold cross-validation plan for the dataset mpg.

This exercise is part of the course

Supervised Learning in R: Regression

Exercise instructions

  • Load the package vtreat.
  • Get the number of rows in mpg and assign it to the variable nRows.
  • Call kWayCrossValidation to create a 3-fold cross validation plan and assign it to the variable splitPlan.
    • You can set the last two arguments of the function to NULL.
  • Call str() to examine the structure of splitPlan.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Load the package vtreat
___

# mpg is available
summary(mpg)

# Get the number of rows in mpg
nRows <- ___

# Create the 3-fold cross-validation plan with vtreat
splitPlan <- ___

# Examine the split plan
___
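
For reference, one possible way to fill in the blanks is sketched here. It assumes that mpg is the data set shipped with the ggplot2 package; in the exercise environment mpg is already loaded, so the assignment from ggplot2 is only needed when running this elsewhere.

```r
# One possible completion of the template.
# Assumption: mpg is the data set shipped with ggplot2.
library(vtreat)
mpg <- ggplot2::mpg

# Get the number of rows in mpg
nRows <- nrow(mpg)

# Create the 3-fold plan; dframe and y are unused, so both are NULL
splitPlan <- kWayCrossValidation(nRows, 3, NULL, NULL)

# Examine the split plan
str(splitPlan)
```

Because the rows are assigned to folds at random, the specific indices shown by str() will differ from run to run unless a seed is set first.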