Automating Hyperparameter Choice

Finding the best hyperparameter value without writing hundreds of lines of code for hundreds of models saves significant effort, and it will help you when you build future machine learning models.

An important hyperparameter for the GBM algorithm is the learning rate. But which learning rate is best for this problem? By writing a loop to test several values, collecting the results, and visualizing them, you can identify the best one.

The possible learning rates to try are 0.001, 0.01, 0.05, 0.1, 0.2 and 0.5.

The datasets X_train, X_test, y_train and y_test are available, and GradientBoostingClassifier has been imported for you.

This exercise is part of the course

<cours>Hyperparameter Tuning in Python</cours>

Exercise instructions

Create a learning_rates list for the learning rates and a results_list to hold the accuracy score of your predictions.
Write a loop that creates a GBM model for each learning rate listed above and generates predictions with each model.
Save the learning rate and the accuracy score to results_list.
Convert the results list to a DataFrame and print it.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Set the learning rates & results storage
learning_rates = ____
results_list = ____

# Create the for loop to evaluate model predictions for each learning rate
for learning_rate in ____:
    model = ____(learning_rate=____)
    predictions = ____.fit(____, ____).predict(____)
    # Save the learning rate and accuracy score
    results_list.append([____, accuracy_score(y_test, ____)])

# Gather everything into a DataFrame
results_df = pd.DataFrame(____, columns=['learning_rate', 'accuracy'])
print(results_df)
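
One way the blanks above could be filled in is sketched below. It is not the course's reference solution. Because the exercise environment is not available here, the sketch builds stand-in data with make_classification and train_test_split (an assumption; the exercise supplies its own X_train, X_test, y_train and y_test) and fixes random_state for repeatability, which the scaffold does not do. The best_row lookup at the end is also an addition for illustration.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic stand-in data, since the exercise's own datasets are not available here
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Set the learning rates & results storage
learning_rates = [0.001, 0.01, 0.05, 0.1, 0.2, 0.5]
results_list = []

# Fit one GBM per learning rate and score its predictions on the test set
for learning_rate in learning_rates:
    # random_state is an addition so repeated runs give the same scores
    model = GradientBoostingClassifier(learning_rate=learning_rate, random_state=42)
    predictions = model.fit(X_train, y_train).predict(X_test)
    # Save the learning rate and accuracy score
    results_list.append([learning_rate, accuracy_score(y_test, predictions)])

# Gather everything into a DataFrame
results_df = pd.DataFrame(results_list, columns=['learning_rate', 'accuracy'])
print(results_df)

# Illustrative extra step: pick the row with the highest accuracy
best_row = results_df.loc[results_df['accuracy'].idxmax()]
print(best_row)
```

The accuracy values depend on the data, so the learning rate that wins on this synthetic set says nothing about which one wins on the exercise data.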


Skill level: Intermediate


In this introductory chapter you will learn the difference between hyperparameters and parameters. You will practice extracting and analyzing parameters and setting hyperparameter values for several popular machine learning algorithms. Along the way you will learn best-practice tips for choosing which hyperparameters to tune and what values to set, and you will build learning curves to analyze your hyperparameter choices.

Exercise 1: Introduction & "Parameters"
Exercise 2: Parameters in Logistic Regression
Exercise 3: Extracting a Logistic Regression parameter
Exercise 4: Extracting a Random Forest parameter
Exercise 5: Introducing Hyperparameters
Exercise 6: Hyperparameters in Random Forests
Exercise 7: Exploring Random Forest Hyperparameters
Exercise 8: Hyperparameters of KNN
Exercise 9: Setting & Analyzing Hyperparameter Values
Exercise 10: Automating Hyperparameter Choice (current exercise)
Exercise 11: Building Learning Curves

This chapter introduces a popular automated hyperparameter tuning methodology called Grid Search. You will learn what it is and how it works, and you will practice running a Grid Search with scikit-learn. You will then learn how to analyze the output of a Grid Search and gain practical experience doing so.

Exercise 1: Introducing Grid Search
Exercise 2: Build Grid Search functions
Exercise 3: Iteratively tune multiple hyperparameters
Exercise 4: How Many Models?
Exercise 5: Grid Search with Scikit Learn
Exercise 6: GridSearchCV inputs
Exercise 7: GridSearchCV with Scikit Learn
Exercise 8: Understanding a grid search output
Exercise 9: Using the best outputs
Exercise 10: Exploring the grid search results
Exercise 11: Analyzing the best results
Exercise 12: Using the best results

This chapter introduces another popular automated hyperparameter tuning methodology called Random Search. You will learn what it is, how it works, and how it differs from Grid Search. You will learn the advantages and disadvantages of this method and when to choose it over Grid Search. You will practice running a Random Search with scikit-learn, as well as visualizing and interpreting the output.

Exercise 1: Introducing Random Search
Exercise 2: Randomly Sample Hyperparameters
Exercise 3: Randomly Search with Random Forest
Exercise 4: Visualizing a Random Search
Exercise 5: Random Search in Scikit Learn
Exercise 6: RandomSearchCV inputs
Exercise 7: The RandomizedSearchCV Object
Exercise 8: RandomSearchCV in Scikit Learn
Exercise 9: Comparing Grid and Random Search
Exercise 10: Comparing Random & Grid Search
Exercise 11: Grid and Random Search Side by Side

This final chapter gives you a taste of more advanced hyperparameter tuning methodologies known as "informed search". These include a methodology known as Coarse to Fine, as well as Bayesian and Genetic hyperparameter tuning algorithms. You will learn how informed search differs from uninformed search and gain practical skills with each of these methodologies, comparing and contrasting them as you go.

Exercise 1: Informed Search: Coarse to Fine
Exercise 2: Visualizing Coarse to Fine
Exercise 3: Coarse to Fine Iterations
Exercise 4: Informed Search: Bayesian Statistics
Exercise 5: Bayes Rule in Python
Exercise 6: Bayesian Hyperparameter tuning with Hyperopt
Exercise 7: Informed Search: Genetic Algorithms
Exercise 8: Genetic Hyperparameter Tuning with TPOT
Exercise 9: Analysing TPOT's stability
Exercise 10: Congratulations!