
Logistic regression baseline classifier

In the last two lessons, you learned how valuable feature selection is in the context of machine learning interviews. Another set of common questions you should expect in a machine learning interview pertains to feature engineering and how it helps improve model performance.

In this exercise, you'll engineer a new feature on the loan_data dataset from Chapter 1 and compare the accuracy scores of Logistic Regression models on the dataset before and after feature engineering, by comparing the test labels with the predicted values of the target variable Loan Status.

All relevant packages have been imported for you: matplotlib.pyplot as plt, seaborn as sns, LogisticRegression from sklearn.linear_model, train_test_split from sklearn.model_selection, and accuracy_score from sklearn.metrics.
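For reference, those pre-loaded packages correspond to the following import statements:

# Imports pre-loaded in the exercise environment
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score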

Feature engineering is considered a pre-processing step before modeling (see the machine learning pipeline diagram).

This exercise is part of the course Practicing Machine Learning Interview Questions in Python.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Create the X matrix and y array, with Loan Status as the target
X = loan_data.drop("Loan Status", axis=1)
y = loan_data["Loan Status"]

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123)

# Instantiate the Logistic Regression model
logistic = LogisticRegression()

# Fit the model on the training data
logistic.fit(X_train, y_train)

# Predict on the test set and print the accuracy
print(accuracy_score(y_true=y_test, y_pred=logistic.predict(X_test)))
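Once the baseline accuracy is printed, the feature-engineering step could look something like the sketch below. The columns Applicant Income and Loan Amount and the new Income to Loan Ratio column are assumptions used for illustration only; substitute whatever numeric columns actually exist in loan_data before refitting and comparing the two accuracy scores.

# Engineer a new ratio feature (column names here are hypothetical)
loan_data["Income to Loan Ratio"] = loan_data["Applicant Income"] / loan_data["Loan Amount"]

# Rebuild X and y, re-split with the same random_state, refit, and print the new accuracy
X_fe = loan_data.drop("Loan Status", axis=1)
y_fe = loan_data["Loan Status"]
X_train_fe, X_test_fe, y_train_fe, y_test_fe = train_test_split(X_fe, y_fe, test_size=0.3, random_state=123)
logistic_fe = LogisticRegression().fit(X_train_fe, y_train_fe)
print(accuracy_score(y_true=y_test_fe, y_pred=logistic_fe.predict(X_test_fe)))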