Split into train and test
Now that we have a dataframe, we can apply standard techniques for modeling. In this exercise, you will split the data into training and test sets.
This exercise is part of the course Predictive Analytics using Networked Data in R.
Exercise instructions
- To ensure the reproducibility of your results, set the seed to 7 using set.seed().
- Use the sample() function to sample two-thirds of the numbers in the sequence from 1 to the total number of rows in studentnetworkdata. Name this vector index_train.
- Create the training set by including the rows of studentnetworkdata that are stored in index_train and name it training_set.
- Create the test set by excluding the rows of studentnetworkdata that are stored in index_train and name it test_set.
Hands-on interactive exercise
Complete the sample code below to finish this exercise.
# Set the seed
set.seed(___)
# Create the index vector
index_train <- sample(1:nrow(___), 2 / 3 * nrow(___))
# Make the training set
training_set <- ___[index_train,]
# Make the test set
___ <- ___[-index_train,]
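
To see how this split pattern behaves, here is a minimal sketch run on a small made-up data frame. The name toy_data and its columns are hypothetical stand-ins, not the course's studentnetworkdata, and the sample size is computed with floor() here only to make the integer size explicit; the template above passes the fraction directly.

```r
# Hypothetical stand-in for studentnetworkdata: 9 rows, made-up columns
toy_data <- data.frame(id = 1:9, score = seq(10, 90, by = 10))

# Fix the random number generator so the same rows are drawn on every run
set.seed(7)

# Draw two-thirds of the row numbers, without replacement (the default)
n <- nrow(toy_data)
index_train <- sample(1:n, floor(2 * n / 3))

# Positive indices keep the sampled rows; negative indices drop them
training_set <- toy_data[index_train, ]
test_set <- toy_data[-index_train, ]

# Sanity checks: sizes add up and no row appears in both sets
nrow(training_set)                               # 6
nrow(test_set)                                   # 3
length(intersect(training_set$id, test_set$id))  # 0
```

Because the test set is built with negative indexing on the same index_train vector, the two sets are disjoint by construction and together cover every row of the original data frame.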