
Try an 80/20 split

Now that your dataset is randomly ordered, you can use the first 80% of its rows as a training set and the last 20% as a test set. To do this, choose a split point approximately 80% of the way through your data:

split <- round(nrow(mydata) * 0.80)
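To get a feel for what this calculation returns, here is a small sketch using made-up row counts (the values in `n_rows` are hypothetical and are not taken from the course data). Because of `round()`, the split point is always a whole row index, even when 80% of the row count is not an integer.

```r
# Hypothetical row counts, chosen only for illustration
n_rows <- c(10, 12, 1000)

# Same formula as above, applied to each row count
split_points <- round(n_rows * 0.80)

# 80% of 12 is 9.6, which round() turns into row index 10
print(split_points)  # 8, 10 and 800
```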

You can then use this point to break off the first 80% of the dataset as a training set:

mydata[1:split, ]

And then you can use that same point to determine the test set:

mydata[(split + 1):nrow(mydata), ]
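The three steps can be put together as in the following minimal sketch. It assumes a small, made-up data frame named `mydata` (its columns `x` and `y` are hypothetical) that is shuffled first, and it stores the two pieces as `train` and `test`. The `stopifnot()` calls are an added sanity check, not part of the original recipe: they confirm that every row ends up in exactly one of the two sets.

```r
# Made-up data for illustration only
set.seed(42)
mydata <- data.frame(x = 1:10, y = rnorm(10))

# Shuffle the rows so the split is random
mydata <- mydata[sample(nrow(mydata)), ]

# Row index roughly 80% of the way through the data
split <- round(nrow(mydata) * 0.80)

# First 80% of rows for training, remaining rows for testing
train <- mydata[1:split, ]
test <- mydata[(split + 1):nrow(mydata), ]

# Sanity checks: no row is lost and no row appears in both sets
stopifnot(nrow(train) + nrow(test) == nrow(mydata))
stopifnot(length(intersect(rownames(train), rownames(test))) == 0)
```

Note that this indexing assumes the dataset has at least three rows. With fewer rows, `split` equals `nrow(mydata)`, so `(split + 1):nrow(mydata)` counts downward and no longer selects a valid test set.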

This exercise is part of the course

Machine Learning with caret in R


Exercise instructions

  • Choose a row index to split on so that the split point is approximately 80% of the way through the diamonds dataset. Call this index split.
  • Create a training set called train using that index.
  • Create a test set called test using that index.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Determine row to split on: split


# Create train


# Create test