Split into train and test

Now that we have a dataframe, we can apply standard techniques for modeling. In this exercise, you will split the data into training and test sets.

This exercise is part of the course

Predictive Analytics using Networked Data in R

Exercise instructions

  • To make your results reproducible, set the seed to 7 using set.seed().
  • Use the sample() function to draw two-thirds of the row numbers of studentnetworkdata, that is, from the sequence 1 to the total number of rows. Name this vector index_train.
  • Create the training set from the rows of studentnetworkdata whose numbers are stored in index_train, and name it training_set.
  • Create the test set from the remaining rows of studentnetworkdata, excluding those in index_train, and name it test_set.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Set the seed
set.seed(___)

# Create the index vector
index_train <- sample(1:nrow(___), 2 / 3 * nrow(___))

# Make the training set
training_set <- ___[index_train,]

# Make the test set
___ <- ___[-index_train,]
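
The same pattern can be tried on a small stand-in data frame before filling in the blanks. The sketch below is only an illustration: toy_data and its columns are hypothetical and are not part of the course data, and the training size is computed with integer division (a variation on the 2 / 3 * nrow() expression above) so that the size passed to sample() is a whole number.

```r
# Hypothetical toy data frame standing in for studentnetworkdata
toy_data <- data.frame(id = 1:9, score = seq(10, 90, by = 10))

# Fix the seed so the same rows are drawn on every run
set.seed(7)

# Two-thirds of the rows, using integer division (assumption: round down)
n_train <- (2 * nrow(toy_data)) %/% 3

# Draw row numbers without replacement
index_train <- sample(seq_len(nrow(toy_data)), n_train)

# Rows selected by the index form the training set
training_set <- toy_data[index_train, ]

# Negative indexing drops those rows, leaving the test set
test_set <- toy_data[-index_train, ]

nrow(training_set)  # 6
nrow(test_set)      # 3
```

Because sample() draws without replacement by default, no row number appears twice in index_train, so the training and test sets do not overlap and together cover every row.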