KFold cross validation

When working with ML models, it's essential to evaluate their performance on unseen data. One common technique for this purpose is k-fold cross-validation, which divides the dataset into k folds and uses each fold once as the test set while the remaining folds form the training set. In this exercise, you'll explore how k-fold cross-validation splits a dataset into training and testing sets. KFold has been imported for you, along with the heart disease dataset features, heart_disease_df_X.
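To make the splitting idea concrete, here is a minimal pure-Python sketch of how k folds can be carved out of a dataset's row positions. The helper name kfold_indices is hypothetical and written only for illustration; it is not scikit-learn's implementation, and it does not shuffle.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for n_splits contiguous folds."""
    # If n_samples is not divisible by n_splits, the first `extra` folds
    # each receive one additional sample so every row is used exactly once.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        # Everything outside the current test fold is used for training.
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(kfold_indices(n_samples=10, n_splits=5))
for i, (train, test) in enumerate(folds):
    print(f"fold {i}: {len(train)} training rows, {len(test)} testing rows")
```

With 10 rows and 5 splits, each fold holds out 2 rows for testing and trains on the other 8, and every row lands in a test fold exactly once.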

This exercise is part of the course

End-to-End Machine Learning

Exercise instructions

  • Create a KFold object with n_splits=5, shuffle=True, and random_state=42
  • Split the data using kfold.split()
  • Print out the number of datapoints in the train and test splits
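One detail worth keeping in mind for these steps: split() yields arrays of row positions rather than the rows themselves. The sketch below illustrates this on a synthetic NumPy array; the array X and its shape are assumptions standing in for the real feature table, which is not reproduced here.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-in for the feature table: 100 rows, 3 columns.
X = np.arange(300).reshape(100, 3)

kfold = KFold(n_splits=5, shuffle=True, random_state=42)

# next() takes only the first of the five (train, test) index pairs.
train_idx, test_idx = next(kfold.split(X))

# The index arrays are then used to select the actual rows.
X_train, X_test = X[train_idx], X[test_idx]

print("Total rows:", len(X))
print("Training rows in first split:", len(train_idx))
print("Testing rows in first split:", len(test_idx))
```

With a pandas DataFrame, the same positional index arrays would typically be passed to .iloc to select rows.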

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create a KFold object
kfold = ____(____, ____, ____)

# Get the train and test data from the first split from the shuffled KFold
train_data_split, test_data_split = next(____.____(____))

# Print out the number of datapoints in the train and test splits
print("Number of datapoints in heart_disease_df_X:", ____)
print("Number of training datapoints in split:", ____)
print("Number of testing datapoints in split:", ____)