KFold cross validation

When working with ML models, it's essential to evaluate their performance on unseen data. One common technique for this purpose is k-fold cross-validation. In this exercise, you'll explore how k-fold cross-validation splits a dataset into training and testing sets. KFold has been imported for you, along with the heart disease dataset features, heart_disease_df_X.
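To make the splitting idea concrete, here is a minimal pure-Python sketch of how k-fold partitioning can work. The helper k_fold_indices is hypothetical and written only for illustration; it is not scikit-learn's implementation, and the 10-sample dataset size is an assumption chosen to keep the output small.

```python
import random

def k_fold_indices(n_samples, n_splits, seed=None):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    if seed is not None:
        # Shuffle once up front so folds are not tied to the original row order.
        random.Random(seed).shuffle(indices)

    # In this sketch, the first `extra` folds get one additional sample
    # when n_samples is not divisible by n_splits.
    base, extra = divmod(n_samples, n_splits)
    folds = []
    start = 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(n_samples=10, n_splits=5, seed=42)
for i, (train, test) in enumerate(folds):
    print(f"Fold {i}: {len(train)} train, {len(test)} test, test indices {sorted(test)}")
```

Each sample lands in the test set of exactly one fold and in the training set of the other four, so every datapoint is used for evaluation once.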

This exercise is part of the course

End-to-End Machine Learning


Exercise instructions

Create a KFold object with n_splits=5, shuffle=True, and random_state=42.
Split the data using kfold.split() and take the first split.
Print the number of datapoints in the train and test splits.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create a KFold object
kfold = ____(____, ____, ____)

# Get the train and test indices for the first split of the shuffled KFold
train_data_split, test_data_split = next(____.____(____))

# Print out the number of datapoints in the train and test splits
print("Number of datapoints in heart_disease_df_X:", ____)
print("Number of training datapoints in split:", ____)
print("Number of testing datapoints in split:", ____)
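
For reference, one way the blanks might be filled in is sketched below. Because heart_disease_df_X is not available outside the exercise environment, the sketch substitutes a made-up NumPy array X with 100 rows and 5 features; that array and its shape are assumptions for illustration only. A pandas DataFrame can be passed to kfold.split() in the same way.

```python
import numpy as np
from sklearn.model_selection import KFold

# Stand-in for heart_disease_df_X (assumption): 100 rows, 5 made-up features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))

# Create a KFold object with the settings from the instructions
kfold = KFold(n_splits=5, shuffle=True, random_state=42)

# kfold.split() is a generator; next() takes only the first of the 5 splits.
# It yields arrays of row indices, not the rows themselves.
train_data_split, test_data_split = next(kfold.split(X))

# Print out the number of datapoints in the full data and in each split
print("Number of datapoints in X:", len(X))
print("Number of training datapoints in split:", len(train_data_split))
print("Number of testing datapoints in split:", len(test_data_split))
```

With 5 splits, each fold holds out one fifth of the rows for testing and trains on the rest. To get the actual rows, index the data with the returned arrays, for example X[train_data_split] for an array or .iloc[train_data_split] for a DataFrame.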
