
Regularization

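Regularization is the process of adding information to a model in order to prevent overfitting. This matters because a model that overfits the training data tends to score worse on the evaluation metrics you saw earlier in the chapter when they are measured on unseen data. Limiting the maximum depth of a decision tree constrains its complexity, which is one way to regularize it. In this exercise, you will vary the max depth parameter of a decision tree to see how the classification results are affected.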
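To make the link between tree depth and overfitting concrete, here is a minimal sketch that compares training and testing accuracy for a few depths. It assumes DecisionTreeClassifier from sklearn.tree and uses synthetic data from make_classification as a stand-in; neither the data nor the names X_tr, X_te, y_tr, y_te, and scores come from the course.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; the course uses its own click data).
# flip_y adds label noise so that an unconstrained tree has something to overfit.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           flip_y=0.1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

scores = {}
for depth in [2, 5, None]:  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    # Store (training accuracy, testing accuracy) for this depth
    scores[depth] = (tree.score(X_tr, y_tr), tree.score(X_te, y_te))
    print("max_depth = %s: train = %.3f, test = %.3f" % ((depth,) + scores[depth]))
```

A large gap between training and testing accuracy for the unconstrained tree is the usual sign of overfitting; a shallower tree typically narrows that gap, though the exact numbers depend on the data.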

X_train, y_train, X_test, y_test are available in your workspace. pandas as pd, numpy as np, and sklearn are also available in your workspace. Additionally, confusion_matrix(), precision_score(), and recall_score() from sklearn.metrics are available.

This exercise is part of the course

Predicting CTR with Machine Learning in Python

Exercise instructions

  • Create different decision trees by varying the maximum depth of each tree.
  • For each tree, fit and produce predictions on testing data.
  • Evaluate the confusion matrix, precision, and recall for each tree.

Interactive exercise

Try this exercise by completing this sample code.

# Iterate over different levels of max depth
for max_depth_val in [2, 3, 5, 10, 15, 20]:
  # Create and fit model
  clf = ____(____ = max_depth_val)
  print("Evaluating tree with max_depth = %s" %(max_depth_val))
  y_pred = clf.fit(____, ____).predict(____) 
  
  # Evaluate confusion matrix, precision, recall
  print("Confusion matrix: ")
  print(____(y_test, y_pred))
  prec = ____(____, ____, average = 'weighted')
  recall = ____(____, ____, average = 'weighted')
  print("Precision: %s, Recall: %s" %(prec, recall))
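One possible way to complete the template above is sketched here. It assumes the intended model is DecisionTreeClassifier from sklearn.tree with its max_depth parameter, and it builds synthetic stand-ins for X_train, y_train, X_test, and y_test so that the block runs outside the course workspace. The results dictionary is a hypothetical addition for inspecting the metrics afterwards.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the workspace variables (an assumption, not the course data)
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

results = {}  # hypothetical helper: max_depth -> (confusion matrix, precision, recall)

# Iterate over different levels of max depth
for max_depth_val in [2, 3, 5, 10, 15, 20]:
    # Create and fit model, then predict on the testing data
    clf = DecisionTreeClassifier(max_depth=max_depth_val, random_state=0)
    print("Evaluating tree with max_depth = %s" % max_depth_val)
    y_pred = clf.fit(X_train, y_train).predict(X_test)

    # Evaluate confusion matrix, precision, recall
    print("Confusion matrix: ")
    cm = confusion_matrix(y_test, y_pred)
    print(cm)
    prec = precision_score(y_test, y_pred, average='weighted')
    recall = recall_score(y_test, y_pred, average='weighted')
    print("Precision: %s, Recall: %s" % (prec, recall))
    results[max_depth_val] = (cm, prec, recall)
```

With average='weighted', precision and recall are computed per class and then averaged using each class's share of the true labels, so the scores reflect class imbalance in y_test.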