
Cross validation

Cross validation is a technique for estimating a model's holdout performance. It helps ensure that test performance is not an artifact of one particular split of the data. In this exercise, you will use scikit-learn's KFold() to set up a K-fold cross validation and cross_val_score() to assess the precision and recall of a decision tree.
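
To see what K-fold splitting actually produces before you fill in the exercise, the short sketch below (a standalone illustration using a made-up toy array X_demo, not the exercise data) prints the train and validation indices for each of four folds:

# Standalone illustration of how KFold() partitions data (toy array, not the exercise workspace)
import numpy as np
from sklearn.model_selection import KFold

X_demo = np.arange(16).reshape(8, 2)  # 8 samples, 2 features
kf = KFold(n_splits=4, shuffle=True, random_state=0)

for fold, (train_idx, val_idx) in enumerate(kf.split(X_demo)):
    print("Fold %d: train %s, validate %s" % (fold, train_idx, val_idx))

Each sample lands in the validation set exactly once across the four folds, which is what makes the resulting precision and recall estimates less sensitive to any single split.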

X_train, y_train, X_test, and y_test are available in your workspace, as are pandas as pd, numpy as np, and sklearn. KFold() and cross_val_score() from sklearn.model_selection are also available.

This exercise is part of the course

Predicting CTR with Machine Learning in Python


Instructions

  • Create a decision tree classifier.
  • Set up a K-Fold cross validation with four splits and assign it to k_fold.
  • Use cross_val_score() with k_fold to evaluate the precision and recall of your model (rather than recall_score() or precision_score()).

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Create model 
clf = ____

# Set up k-fold
k_fold = ____(n_splits = 4, random_state = 0, shuffle = True)

# Evaluate precision and recall for each fold
precision = ____(
  clf, X_train, ____, cv = ____, scoring = 'precision_weighted')
recall = ____(
  clf, X_train, ____, cv = ____, scoring = 'recall_weighted')
print("Precision scores: %s" %(precision)) 
print("Recall scores: %s" %(recall))
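
For reference, one way the blanks might be filled in is sketched below. The use of DecisionTreeClassifier from sklearn.tree is an assumption on our part (any classifier with a fit/predict interface works with cross_val_score()), and X_train / y_train are the workspace variables described above.

# One possible completion (a sketch, not the official solution)
from sklearn.tree import DecisionTreeClassifier   # assumed choice of classifier
from sklearn.model_selection import KFold, cross_val_score

# Create model
clf = DecisionTreeClassifier()

# Set up k-fold
k_fold = KFold(n_splits=4, random_state=0, shuffle=True)

# Evaluate precision and recall for each fold
precision = cross_val_score(
    clf, X_train, y_train, cv=k_fold, scoring='precision_weighted')
recall = cross_val_score(
    clf, X_train, y_train, cv=k_fold, scoring='recall_weighted')
print("Precision scores: %s" % (precision))
print("Recall scores: %s" % (recall))

The 'precision_weighted' and 'recall_weighted' scorers average per-class precision and recall weighted by class support, a common choice when the click/no-click classes are imbalanced.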