Extracting a logistic regression parameter

You will now practice extracting an important parameter of the logistic regression model. Logistic regression has other parameters that you will not explore here, but you can review them in the scikit-learn.org documentation for the LogisticRegression() class under "Attributes".
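As a minimal sketch of what those attributes look like on a fitted estimator, the snippet below trains a model on synthetic data and prints a few of them. The data and the variable name clf are assumptions made for illustration; they are not part of the exercise.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary-classification data (illustrative only, not the course data)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# Learned parameters are stored in attributes whose names end with an underscore
print(clf.coef_.shape)       # one row of coefficients for a binary target
print(clf.intercept_.shape)  # one intercept for a binary target
print(clf.classes_)          # class labels seen during fit
print(clf.n_iter_)           # iterations the solver actually ran
```

These attributes only exist after fit() has been called, which is one way to tell learned parameters apart from the hyperparameters passed to the constructor.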

This parameter is essential for understanding the direction and magnitude of each variable's effect on the target.
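One common way to read direction and magnitude is to exponentiate a coefficient, which gives the multiplicative change in the odds of the positive class for a one-unit increase in that variable. The sketch below uses hypothetical coefficient values and feature names chosen purely for illustration.

```python
import math

# Hypothetical coefficients, for illustration only
coefficients = {"feature_a": 0.8, "feature_b": -0.3, "feature_c": 0.0}

for name, coef in coefficients.items():
    # exp(coef) is the factor by which the odds change per one-unit increase
    odds_ratio = math.exp(coef)
    if coef > 0:
        direction = "raises"
    elif coef < 0:
        direction = "lowers"
    else:
        direction = "does not change"
    print(f"{name}: coef={coef:+.2f}, odds ratio={odds_ratio:.2f} ({direction} the odds)")
```

Magnitudes are only comparable across variables when the variables are on similar scales, so a large coefficient on an unscaled feature does not necessarily mean a large practical effect.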

In this exercise, you will extract the coefficients (stored in the coef_ attribute), map them to the original column names, and then identify the variables with the strongest positive effect on the target variable.

You have available:

A logistic regression model object named log_reg_clf
The X_train DataFrame

sklearn and pandas (as pd) have been imported for you.

This exercise is part of the course

<cours>Hyperparameter Tuning in Python</cours>

Exercise instructions

Create a list of the original column names used in the training DataFrame.
Extract the coefficients of the logistic regression estimator.
Create a DataFrame containing the coefficients and the variable names, then print it.
Print the top 3 "positive" variables, ranked by coefficient value.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Create a list of original variable names from the training DataFrame
original_variables = ____

# Extract the coefficients of the logistic regression estimator
model_coefficients = ____.____[____]

# Create a dataframe of the variables and coefficients & print it out
coefficient_df = pd.DataFrame({"Variable" : ____, "Coefficient": ____})
print(coefficient_df)

# Print out the top 3 positive variables
top_three_df = coefficient_df.sort_values(by=____, axis=0, ascending=____)[0:____]
print(top_three_df)
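One possible way to fill in the blanks is sketched below. Because the exercise's log_reg_clf and X_train are not available here, the sketch builds stand-ins from synthetic data; the data, the feature_0 to feature_4 column names, and the max_iter setting are assumptions for illustration, not the course's actual setup.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-ins for the exercise's X_train and log_reg_clf (synthetic, illustrative only)
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=3, n_redundant=0, random_state=42
)
X_train = pd.DataFrame(X, columns=[f"feature_{i}" for i in range(5)])
log_reg_clf = LogisticRegression(max_iter=1000).fit(X_train, y)

# Column names, in the order the model saw them during fitting
original_variables = list(X_train.columns)

# coef_ has shape (1, n_features) for a binary target, so take the first row
model_coefficients = log_reg_clf.coef_[0]

# Pair each variable name with its coefficient
coefficient_df = pd.DataFrame(
    {"Variable": original_variables, "Coefficient": model_coefficients}
)
print(coefficient_df)

# Sort descending so the largest coefficients come first, then keep three rows
top_three_df = coefficient_df.sort_values(by="Coefficient", axis=0, ascending=False)[0:3]
print(top_three_df)
```

Sorting in descending order returns the three largest coefficients; if fewer than three coefficients are positive, some of the returned rows will have negative values, so it can be worth checking the signs before calling them "positive" variables.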

In this introductory chapter you will learn the difference between hyperparameters and parameters. You will practice extracting and analyzing parameters and setting hyperparameter values for several popular machine learning algorithms. Along the way you will learn best-practice tips and tricks for choosing which hyperparameters to tune and what values to set, and you will build learning curves to analyze your hyperparameter choices.

Exercise 1: Introduction and "Parameters"
Exercise 2: Parameters in logistic regression
Exercise 3: Extracting a logistic regression parameter (current exercise)
Exercise 4: Extracting a Random Forest parameter
Exercise 5: Introduction to hyperparameters
Exercise 6: Hyperparameters in Random Forests
Exercise 7: Exploring Random Forest hyperparameters
Exercise 8: Hyperparameters of KNN
Exercise 9: Setting and analyzing hyperparameter values
Exercise 10: Automating hyperparameter choice
Exercise 11: Building learning curves

This chapter introduces you to a popular automated hyperparameter tuning methodology called Grid Search. You will learn what it is and how it works, and you will practice running a Grid Search using scikit-learn. You will then learn how to analyze the output of a Grid Search and gain practical experience doing so.

Exercise 1: Introducing Grid Search
Exercise 2: Build Grid Search functions
Exercise 3: Iteratively tune multiple hyperparameters
Exercise 4: How Many Models?
Exercise 5: Grid Search with Scikit Learn
Exercise 6: GridSearchCV inputs
Exercise 7: GridSearchCV with Scikit Learn
Exercise 8: Understanding a grid search output
Exercise 9: Using the best outputs
Exercise 10: Exploring the grid search results
Exercise 11: Analyzing the best results
Exercise 12: Using the best results

In this chapter you will be introduced to another popular automated hyperparameter tuning methodology called Random Search. You will learn what it is, how it works and, importantly, how it differs from Grid Search. You will learn some advantages and disadvantages of this method and when to choose it over Grid Search. You will practice running a Random Search with scikit-learn as well as visualizing and interpreting the output.

Exercise 1: Introducing Random Search
Exercise 2: Randomly Sample Hyperparameters
Exercise 3: Randomly Search with Random Forest
Exercise 4: Visualizing a Random Search
Exercise 5: Random Search in Scikit Learn
Exercise 6: RandomSearchCV inputs
Exercise 7: The RandomizedSearchCV Object
Exercise 8: RandomSearchCV in Scikit Learn
Exercise 9: Comparing Grid and Random Search
Exercise 10: Comparing Random & Grid Search
Exercise 11: Grid and Random Search Side by Side

In this final chapter you will be given a taste of more advanced hyperparameter tuning methodologies known as "informed search". This includes a methodology known as Coarse To Fine as well as Bayesian & Genetic hyperparameter tuning algorithms. You will learn how informed search differs from uninformed search and gain practical skills with each of the mentioned methodologies, comparing and contrasting them as you go.

Exercise 1: Informed Search: Coarse to Fine
Exercise 2: Visualizing Coarse to Fine
Exercise 3: Coarse to Fine Iterations
Exercise 4: Informed Search: Bayesian Statistics
Exercise 5: Bayes Rule in Python
Exercise 6: Bayesian Hyperparameter tuning with Hyperopt
Exercise 7: Informed Search: Genetic Algorithms
Exercise 8: Genetic Hyperparameter Tuning with TPOT
Exercise 9: Analysing TPOT's stability
Exercise 10: Congratulations!