Exercise

# GridSearchCV to find optimal parameters

In this exercise you're going to **tweak our model in a less "random" way**, by using `GridSearchCV` to do the work for you.

With `GridSearchCV` you can define **which performance metric to score** the candidate settings on. Since in fraud detection we are mostly interested in catching as many fraud cases as possible, you can optimize your model settings to get the best possible recall score. If you also cared about reducing the number of false positives, you could optimize on F1-score instead, which gives you a nice precision-recall trade-off.

`GridSearchCV` has already been imported from `sklearn.model_selection`, so let's give it a try!

Instructions

**100 XP**

- Define in the parameter grid that you want to try 1 and 30 trees, and that you want to try the `gini` and `entropy` split criteria.
- Define the model as a plain `RandomForestClassifier`; keep `random_state` at 5 to be able to compare models.
- Set the `scoring` option such that it optimizes for recall.
- Fit the model to the training data `X_train` and `y_train`, and obtain the best parameters for the model.
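The steps above can be sketched as follows. Note that in the exercise `X_train` and `y_train` are provided; here they are stand-ins, generated with `make_classification` so the snippet runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Stand-in for the exercise's fraud data: an imbalanced binary problem
X_train, y_train = make_classification(
    n_samples=300, n_features=10, weights=[0.9, 0.1], random_state=5
)

# Parameter grid: try 1 and 30 trees, and both split criteria
param_grid = {
    "n_estimators": [1, 30],
    "criterion": ["gini", "entropy"],
}

# Plain RandomForestClassifier; random_state=5 keeps runs comparable
model = RandomForestClassifier(random_state=5)

# Optimize for recall to catch as many fraud cases as possible
CV_model = GridSearchCV(model, param_grid=param_grid, scoring="recall", cv=5)

# Fit on the training data; GridSearchCV cross-validates every combination
CV_model.fit(X_train, y_train)

# The best combination found for the grid
print(CV_model.best_params_)
```

Swapping `scoring="recall"` for `scoring="f1"` is all it takes to optimize for the precision-recall trade-off mentioned above instead.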