Random forests

Random forests are a classic and powerful ensemble method that combines many individual decision trees via bootstrap aggregation (bagging for short). Two of the main hyperparameters of this type of model are the number of trees and the maximum depth of each tree. In this exercise, you will implement and evaluate a simple random forest classifier with fixed hyperparameter values.
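To make the bagging idea concrete, here is a minimal hand-rolled sketch, not how RandomForestClassifier is implemented internally. It assumes synthetic data from make_classification as a stand-in for the course data, and the names rng, trees, and avg_proba are illustrative only. Each tree is fit on a bootstrap sample (rows drawn with replacement), and the ensemble prediction averages the per-tree class probabilities.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; not the course's CTR data)
X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
n_trees, max_depth = 50, 5  # the two hyperparameters named in the text

trees = []
for _ in range(n_trees):
    # Bootstrap: draw len(X_tr) row indices with replacement
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    tree = DecisionTreeClassifier(
        max_depth=max_depth,
        max_features="sqrt",  # random feature subset at each split
        random_state=int(rng.integers(1_000_000)),
    )
    trees.append(tree.fit(X_tr[idx], y_tr[idx]))

# Aggregate: average the per-tree class probabilities, then take the
# most likely class (labels here are 0 and 1, so argmax is the label)
avg_proba = np.mean([t.predict_proba(X_te) for t in trees], axis=0)
bagged_pred = avg_proba.argmax(axis=1)
print("Bagged accuracy:", (bagged_pred == y_te).mean())
```

Adding trees mainly reduces the variance of the averaged prediction, while limiting depth keeps each individual tree from overfitting its bootstrap sample.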

X_train, y_train, X_test, and y_test are available in your workspace, as are pandas as pd, numpy as np, and sklearn. RandomForestClassifier() from sklearn.ensemble is also available, along with roc_curve() and auc() from sklearn.metrics.

This exercise is part of the course

Predicting CTR with Machine Learning in Python

Exercise instructions

Create a random forest classifier with 50 trees, and a max depth of 5.
Train the classifier and get probability scores via .predict_proba(), and predictions via .predict() for the testing data.
Evaluate the AUC of the ROC curve for the classifier by first using roc_curve() to calculate fpr and tpr, and then calling auc() on the result.
Evaluate the precision and recall for the classifier.
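The metrics in these instructions can be checked on a tiny hand-made example. This is a sketch with made-up labels and scores (y_true, y_prob, and the 0.5 threshold are assumptions for illustration), showing that roc_curve() followed by auc() gives the same value as roc_auc_score(), and what the weighted average does for precision and recall.

```python
import numpy as np
from sklearn.metrics import (
    auc, precision_score, recall_score, roc_auc_score, roc_curve,
)

# Made-up labels and predicted probabilities of class 1 (illustrative only)
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])
y_hat = (y_prob >= 0.5).astype(int)  # hard labels: [0, 0, 0, 1]

# ROC curve points, then the area under them
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
roc_auc = auc(fpr, tpr)
print(roc_auc)                        # 0.75
print(roc_auc_score(y_true, y_prob))  # same value in one call

# 'weighted' averages the per-class scores, weighted by class support
precision = precision_score(y_true, y_hat, average="weighted")
recall = recall_score(y_true, y_hat, average="weighted")
print(precision, recall)
```

Here class 0 has precision 2/3 and recall 1, class 1 has precision 1 and recall 1/2, and both classes have support 2, so the weighted precision is 5/6 and the weighted recall is 0.75.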

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Create random forest classifier with specified params
clf = ____(____ = 50, ____ = 5)

# Train classifier - predict probability score and label
y_score = clf.____(X_train, y_train).____(X_test) 
y_pred = clf.____(X_train, y_train).____(X_test) 

# Get ROC curve metrics
fpr, tpr, thresholds = ____(y_test, y_score[:, 1])
print("AUC of ROC: %s" % (____(fpr, tpr)))

# Get precision and recall
precision = ____(y_test, y_pred, average = 'weighted')
recall = ____(y_test, y_pred, average = 'weighted')
print("Precision: %s, Recall: %s" %(precision, recall))
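One possible completion of the scaffold above is sketched below; it is not necessarily the course's reference solution. Because the workspace variables are not available outside the exercise, it assumes synthetic, imbalanced data from make_classification in place of X_train, y_train, X_test, and y_test, and it imports precision_score and recall_score from sklearn.metrics for the last step. The random_state arguments are additions for reproducibility.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, precision_score, recall_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the workspace data (an assumption)
X, y = make_classification(
    n_samples=1000, n_features=20, weights=[0.8, 0.2], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Create random forest classifier with specified params
clf = RandomForestClassifier(n_estimators=50, max_depth=5, random_state=42)

# Train classifier once, then predict probability scores and labels
clf.fit(X_train, y_train)
y_score = clf.predict_proba(X_test)
y_pred = clf.predict(X_test)

# Get ROC curve metrics (column 1 holds the positive-class probability)
fpr, tpr, thresholds = roc_curve(y_test, y_score[:, 1])
roc_auc = auc(fpr, tpr)
print("AUC of ROC: %s" % roc_auc)

# Get precision and recall
precision = precision_score(y_test, y_pred, average="weighted")
recall = recall_score(y_test, y_pred, average="weighted")
print("Precision: %s, Recall: %s" % (precision, recall))
```

The scaffold calls fit twice, once per prediction line. Fitting once and reusing the trained classifier, as in this sketch, avoids training the forest a second time.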


Skill level: Intermediate

Chances are you’re on this page because you clicked a link. In this chapter, you’ll learn why click-through rates (CTR) are integral to targeted advertising, how to perform basic DataFrame manipulation, and how you can use machine learning models to predict CTR.

Exercise 1: Introduction to click-through rates
Exercise 2: Beginning steps
Exercise 3: Feature exploration
Exercise 4: First evaluation of data
Exercise 5: Overview of machine learning models
Exercise 6: Logistic regression for breast cancer
Exercise 7: Logistic regression for images
Exercise 8: A second toy model
Exercise 9: CTR prediction using decision trees
Exercise 10: Model implementation
Exercise 11: A first CTR model
Exercise 12: Beyond only accuracy

This chapter provides the foundations for exploratory data analysis (EDA). Using sample data, you’ll use the pandas library to look at columns and data types, explore missing data, and use hashing to perform feature engineering on categorical features, all of which are important when exploring features for more accurate CTR prediction.

Exercise 1: Exploratory data analysis
Exercise 2: A first look
Exercise 3: Checking for missing values
Exercise 4: Distributions by CTR
Exercise 5: Feature engineering
Exercise 6: Analyzing datetime columns
Exercise 7: Converting categorical variables
Exercise 8: Creating new features
Exercise 9: Standardizing features
Exercise 10: Log normalization
Exercise 11: Understanding standardization
Exercise 12: Standard scaling

It’s time to dive deeper. Find out how you can use measures of model performance including precision and recall to answer real-world questions, such as evaluating ROI on ad spend. You’ll also learn ways to improve upon those evaluation metrics, such as ensemble methods and hyperparameter tuning.

Exercise 1: Applications of metric evaluation
Exercise 2: Four categories of outcomes
Exercise 3: Evaluating four categories
Exercise 4: ROI on ad spend
Exercise 5: Model evaluation
Exercise 6: Precision and recall
Exercise 7: Baseline
Exercise 8: Classifier comparison
Exercise 9: Tuning models
Exercise 10: Regularization
Exercise 11: Cross validation
Exercise 12: Model selection
Exercise 13: Ensembles and hyperparameter tuning
Exercise 14: Understanding hyperparameter tuning
Exercise 15: Random forests (current exercise)
Exercise 16: Grid search

Profits can be heavily impacted by your campaign’s CTR. In this chapter, you’ll learn how deep learning can be used to reduce that risk. You’ll focus on multi-layer perceptron (MLP) and neural network models, and learn how these can be used to capture the complex relationship between variables to more accurately predict CTR. Lastly, you’ll explore how to apply the basics of hyperparameter tuning and regularization to classification models.

Exercise 1: Introduction to deep learning
Exercise 2: Understanding MLPs
Exercise 3: Beginning model
Exercise 4: MLPs for CTR
Exercise 5: Hyperparameter tuning in deep learning
Exercise 6: Hyperparameter tuning in MLPs
Exercise 7: Varying hyperparameters
Exercise 8: MLP Grid Search
Exercise 9: Model evaluation
Exercise 10: F-beta score
Exercise 11: Low precision and high AUC
Exercise 12: Precision, ROI, and AUC
Exercise 13: Model review and comparison
Exercise 14: Model comparison warmup
Exercise 15: Evaluating precision and ROI
Exercise 16: Total scoring
Exercise 17: Wrap-up video