Exercise

# Identify optimal L1 penalty coefficient

You will now tune the `C` parameter for L1 regularization to find the value that reduces model complexity while still maintaining good model performance. You will run a `for` loop over the candidate `C` values, build a logistic regression instance for each, and calculate its performance metrics.

A list `C` has been created with the candidate values. The `l1_metrics` array has been built with 3 columns: the first holds the `C` values, and the next two are placeholders for the non-zero coefficient counts and the recall scores of the models. The scaled features and target variables have been loaded as `train_X`, `train_Y` for training, and `test_X`, `test_Y` for testing.

Both `numpy` and `pandas` are loaded as `np` and `pd`, as well as the `recall_score` function from `sklearn`.

Instructions

**100 XP**

- Run a `for` loop over the range from 0 to the length of the list `C`.
- For each `C` candidate, initialize and fit a logistic regression, then predict churn on the test data.
- For each `C` candidate, store the non-zero coefficient count and the recall score in the second and third columns of `l1_metrics`.
- Create a `pandas` DataFrame out of `l1_metrics` with the appropriate column names.
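The steps above can be sketched as follows. Since the exercise's preloaded variables (`C`, `l1_metrics`, `train_X`, `train_Y`, `test_X`, `test_Y`) are not available here, the block builds synthetic stand-ins for them; the candidate `C` values and column names are illustrative, not the exercise's exact ones.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the preloaded data (the exercise environment
# provides scaled train_X, train_Y, test_X, test_Y for you).
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           random_state=42)
X = StandardScaler().fit_transform(X)
train_X, test_X = X[:300], X[300:]
train_Y, test_Y = y[:300], y[300:]

# Illustrative candidate C values and the 3-column metrics placeholder
C = [1, 0.5, 0.25, 0.1, 0.05, 0.025, 0.01, 0.005]
l1_metrics = np.zeros((len(C), 3))
l1_metrics[:, 0] = C

for index in range(0, len(C)):
    # L1-penalized logistic regression; the liblinear solver supports l1
    logreg = LogisticRegression(penalty='l1', C=C[index], solver='liblinear')
    logreg.fit(train_X, train_Y)
    pred_test_Y = logreg.predict(test_X)
    # Non-zero coefficient count and recall go in columns 2 and 3
    l1_metrics[index, 1] = np.count_nonzero(logreg.coef_)
    l1_metrics[index, 2] = recall_score(test_Y, pred_test_Y)

col_names = ['C', 'Non-Zero Coeffs', 'Recall']
print(pd.DataFrame(l1_metrics, columns=col_names))
```

Note that smaller `C` means a stronger L1 penalty in scikit-learn, so the non-zero coefficient count should shrink as you move down the list; you would pick the smallest `C` whose recall is still acceptable.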