BaşlayınÜcretsiz Başlayın

Split the data

A dataframe df_examples is available having columns endword: string, features: vector, outvec: vector, and label: int. You're going to split it to obtain training and testing set, which you will use to train and test a classifier.

Bu egzersiz

Introduction to Spark SQL in Python

kursunun bir parçasıdır
Kursu Görüntüle

Egzersiz talimatları

  • Split the examples into train and test using a 80/20 split.
  • Print the number of training examples.
  • Print the number of test examples.

Uygulamalı interaktif egzersiz

Bu örnek kodu tamamlayarak bu egzersizi bitirin.

# Split the examples into train and test, use 80/20 split
df_trainset, df_testset = df_examples.____((____), 42)

# Print the number of training examples
print("Number training: ", ____.____)

# Print the number of test examples
print("Number test: ", ____.____)
Kodu Düzenle ve Çalıştır