Model selection

Regularization and cross-validation are both powerful tools for model selection. Regularization helps prevent overfitting, and cross-validation ensures that your models are evaluated properly. In this exercise, you will use the two together, with the decision tree's max_depth acting as the regularization setting, and check whether the resulting models differ significantly from one another. You will compute only precision, although the same approach applies equally well to recall and other evaluation metrics.
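One informal way to judge whether two models differ is to compare the mean and spread of their per-fold scores. The sketch below assumes synthetic data from make_classification as a stand-in for the course data, and the `summary` dictionary is a hypothetical helper; it is not a formal significance test.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): not the CTR dataset from the course.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)

# Reuse the same fold definition for every model so the scores are comparable.
k_fold = KFold(n_splits=4, shuffle=True, random_state=0)

summary = {}  # hypothetical helper: max_depth -> (mean, std) of fold scores
for depth in [3, 10]:
    clf = DecisionTreeClassifier(max_depth=depth, random_state=0)
    scores = cross_val_score(clf, X, y, cv=k_fold, scoring="precision_weighted")
    summary[depth] = (scores.mean(), scores.std())
    print("max_depth = %s: mean = %.3f, std = %.3f" % (depth, scores.mean(), scores.std()))
```

If the gap between two means is small relative to the standard deviations across folds, the difference between the models is probably not meaningful.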

X_train, y_train, X_test, and y_test are available in your workspace, as are pandas as pd, numpy as np, and sklearn. precision_score() and recall_score() from sklearn.metrics, and KFold() and cross_val_score() from sklearn.model_selection, are also available.

This exercise is part of the course

Predicting CTR with Machine Learning in Python

Exercise instructions

Set up a K-Fold cross-validation with four splits using n_splits and assign it to k_fold.
Create a decision tree classifier.
Use k_fold to run cross-validation and evaluate the precision of your decision tree model for the given max_depth value.
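To see what a four-split K-Fold produces, the following minimal sketch splits twelve toy samples; `X_demo` and `fold_sizes` are hypothetical names used only for illustration. Each of the four folds holds out 3 samples for validation and trains on the remaining 9.

```python
import numpy as np
from sklearn.model_selection import KFold

# Twelve toy samples with two features each (assumption: stand-in for X_train).
X_demo = np.arange(24).reshape(12, 2)

# Same settings as in the exercise: four shuffled splits with a fixed seed.
k_fold = KFold(n_splits=4, shuffle=True, random_state=0)

fold_sizes = []
for train_idx, val_idx in k_fold.split(X_demo):
    # Every sample lands in the validation set of exactly one fold.
    fold_sizes.append((len(train_idx), len(val_idx)))
    print("train: %s, validation: %s" % (len(train_idx), len(val_idx)))
```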

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Iterate over different levels of max depth and set up k-fold
for max_depth_val in [3, 5, 10]:
  k_fold = ____(____ = 4, random_state = 0, shuffle = True)
  clf = ____(____ = max_depth_val)
  print("Evaluating Decision Tree for max_depth = %s" %(max_depth_val))
  y_pred = clf.fit(____, ____).predict(____) 
  
  # Calculate precision for cross validation and test
  cv_precision = ____(
    ____, X_train, y_train, cv = k_fold, scoring = 'precision_weighted')
  precision = ____(y_test, y_pred, average = 'weighted')
  print("Cross validation Precision: %s" %(cv_precision))
  print("Test Precision: %s" %(precision))
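For reference, here is one possible completion of the template, offered as a sketch rather than the official solution. It assumes DecisionTreeClassifier from sklearn.tree is the intended classifier and substitutes synthetic data for the workspace variables; the `results` dictionary is a hypothetical addition for inspecting the scores afterwards.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import precision_score
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the exercise supplies its own splits.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

results = {}  # hypothetical helper: max_depth -> (fold scores, test precision)

# Iterate over different levels of max depth and set up k-fold
for max_depth_val in [3, 5, 10]:
    k_fold = KFold(n_splits=4, random_state=0, shuffle=True)
    clf = DecisionTreeClassifier(max_depth=max_depth_val)
    print("Evaluating Decision Tree for max_depth = %s" % (max_depth_val))
    y_pred = clf.fit(X_train, y_train).predict(X_test)

    # Calculate precision for cross validation and test
    cv_precision = cross_val_score(
        clf, X_train, y_train, cv=k_fold, scoring="precision_weighted")
    precision = precision_score(y_test, y_pred, average="weighted")
    results[max_depth_val] = (cv_precision, precision)
    print("Cross validation Precision: %s" % (cv_precision))
    print("Test Precision: %s" % (precision))
```

Note that cross_val_score refits clones of clf on each training fold, so the cross-validation scores never see X_test.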

Chances are you’re on this page because you clicked a link. In this chapter, you’ll learn why click-through-rates (CTR) are integral to targeted advertising, how to perform basic DataFrame manipulation, and how you can use machine learning models to predict CTR.

Exercise 1: Introduction to click-through rates Exercise 2: Beginning steps Exercise 3: Feature exploration Exercise 4: First evaluation of data Exercise 5: Overview of machine learning models Exercise 6: Logistic regression for breast cancer Exercise 7: Logistic regression for images Exercise 8: A second toy model Exercise 9: CTR prediction using decision trees Exercise 10: Model implementation Exercise 11: A first CTR model Exercise 12: Beyond only accuracy

This chapter provides the foundations for exploratory data analysis (EDA). Using sample data you’ll use the pandas library to look at columns and data types, explore missing data, and use hashing to perform feature engineering on categorical features. All of which are important when exploring features for more accurate CTR prediction.

Exercise 1: Exploratory data analysis Exercise 2: A first look Exercise 3: Checking for missing values Exercise 4: Distributions by CTR Exercise 5: Feature engineering Exercise 6: Analyzing datetime columns Exercise 7: Converting categorical variables Exercise 8: Creating new features Exercise 9: Standardizing features Exercise 10: Log normalization Exercise 11: Understanding standardization Exercise 12: Standard scaling

It’s time to dive deeper. Find out how you can use measures of model performance including precision and recall to answer real-world questions, such as evaluating ROI on ad spend. You’ll also learn ways to improve upon those evaluation metrics, such as ensemble methods and hyperparameter tuning.

Exercise 1: Applications of metric evaluation Exercise 2: Four categories of outcomes Exercise 3: Evaluating the four categories Exercise 4: ROI on ad spend Exercise 5: Model evaluation Exercise 6: Precision and recall Exercise 7: Baseline Exercise 8: Comparing classification models Exercise 9: Tuning models Exercise 10: Regularization Exercise 11: Cross-validation Exercise 12: Model selection (current exercise) Exercise 13: Ensembles and hyperparameter tuning Exercise 14: Understanding hyperparameter tuning Exercise 15: Random forests Exercise 16: Grid search

Profits can be heavily impacted by your campaign’s CTR. In this chapter, you’ll learn how deep learning can be used to reduce that risk. You’ll focus on multi-layer perceptron (MLP) and neural network models, and learn how these can be used to capture the complex relationship between variables to more accurately predict CTR. Lastly, you’ll explore how to apply the basics of hyperparameter tuning and regularization to classification models.

Exercise 1: Introduction to deep learning Exercise 2: Understanding MLPs Exercise 3: Beginning model Exercise 4: MLPs for CTR Exercise 5: Hyperparameter tuning in deep learning Exercise 6: Hyperparameter tuning in MLPs Exercise 7: Varying hyperparameters Exercise 8: MLP Grid Search Exercise 9: Model evaluation Exercise 10: F-beta score Exercise 11: Low precision and high AUC Exercise 12: Precision, ROI, and AUC Exercise 13: Model review and comparison Exercise 14: Model comparison warmup Exercise 15: Evaluating precision and ROI Exercise 16: Total scoring Exercise 17: Wrap-up video