Create a cross-validation plan

There are several ways to implement an n-fold cross-validation plan. In this exercise, you will create one with vtreat::kWayCrossValidation() and examine it.

kWayCrossValidation() creates a cross-validation plan with the following call:

splitPlan <- kWayCrossValidation(nRows, nSplits, dframe, y)

where nRows is the number of rows of data to be split, and nSplits is the desired number of cross-validation folds.

Strictly speaking, dframe and y aren't used by kWayCrossValidation; they are there for compatibility with other vtreat data partitioning functions. You can set them both to NULL.
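As a minimal sketch of the call described above, assuming the vtreat package is installed: the row count of 10 is a made-up value for illustration, not taken from the exercise data.

```r
# Assumes the vtreat package is installed.
library(vtreat)

# Hypothetical row count, chosen only for illustration.
nRows <- 10

# dframe and y are not used by kWayCrossValidation, so NULL is passed for both.
splitPlan <- kWayCrossValidation(nRows, 3, NULL, NULL)

# The plan is a list with one element per fold.
length(splitPlan)
```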

The resulting splitPlan is a list of nSplits elements; each element contains two vectors:

  • train: the row indices of the data that will form the training set
  • app: the row indices of the data that will form the test (or application) set
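The structure described above can be checked directly. The sketch below assumes vtreat is installed and uses a made-up row count of 12. The call to set.seed() is only there to make the random fold assignment repeatable.

```r
# Assumes the vtreat package is installed.
library(vtreat)

set.seed(1)   # fold assignment is random; fix the seed for repeatability
nRows <- 12   # hypothetical row count, for illustration only
splitPlan <- kWayCrossValidation(nRows, 3, NULL, NULL)

# Each fold holds a train vector and an app vector of row indices.
fold1 <- splitPlan[[1]]
fold1$train
fold1$app

# Within a fold, train and app do not share any row index.
overlap <- intersect(fold1$train, fold1$app)
length(overlap)

# Across folds, the app vectors together cover every row exactly once.
allApp <- unlist(lapply(splitPlan, function(fold) fold$app), use.names = FALSE)
all(sort(allApp) == seq_len(nRows))
```

Because each row lands in exactly one app set, every row can receive a prediction from a model that never saw it during training.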

In this exercise, you will create a 3-fold cross-validation plan for the dataset mpg.
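To show what such a plan is for, here is a hedged sketch of one way the train and app indices might be used to collect out-of-sample predictions. It is not part of this exercise: the built-in mtcars data stands in for mpg, and the formula mpg ~ wt and the names pred_cv and rmse_cv are illustrative assumptions.

```r
# Assumes the vtreat package is installed.
library(vtreat)

set.seed(2024)  # make the random fold assignment repeatable

# mtcars is a stand-in dataset, not the exercise's mpg data.
nRows <- nrow(mtcars)
splitPlan <- kWayCrossValidation(nRows, 3, NULL, NULL)

# Vector to hold one out-of-sample prediction per row.
pred_cv <- numeric(nRows)

for (fold in splitPlan) {
  # Fit on the training rows of this fold only.
  model <- lm(mpg ~ wt, data = mtcars[fold$train, ])
  # Predict on the held-out (application) rows of this fold.
  pred_cv[fold$app] <- predict(model, newdata = mtcars[fold$app, ])
}

# Cross-validated root mean squared error.
rmse_cv <- sqrt(mean((mtcars$mpg - pred_cv)^2))
rmse_cv
```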

This exercise is part of the course

Supervised Learning in R: Regression

Exercise instructions

  • Load the package vtreat.
  • Get the number of rows in mpg and assign it to the variable nRows.
  • Call kWayCrossValidation() to create a 3-fold cross-validation plan and assign it to the variable splitPlan.
    • You can set the last two arguments of the function to NULL.
  • Call str() to examine the structure of splitPlan.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Load the package vtreat
___

# mpg is available
summary(mpg)

# Get the number of rows in mpg
nRows <- ___

# Implement the 3-fold cross-validation plan with vtreat
splitPlan <- ___

# Examine the split plan
___