Pickles

Finally, it is time to push your first model to production. It is a random forest classifier that you will use as a baseline while you work on a better alternative. The data is already split into training and test sets with their usual names, X_train, X_test, y_train and y_test. You also have access to the RandomForestClassifier class and the pickle module, whose functions .load() and .dump() you will need for this exercise.

This exercise is part of the course

Designing Machine Learning Workflows in Python

Instructions

  • Fit a random forest classifier to the data. Fix the random seed to 42 to ensure that your results are reproducible.
  • Write the model to a file using pickle. Open the destination file using the with open(____) as ____ syntax.
  • Now load the model from file into a different variable name, clf_from_file.
  • Store the predictions from the model you loaded into a variable preds.
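
For reference, the same fit, save, load and predict round trip can be sketched outside the exercise environment. This is a minimal sketch under stated assumptions rather than the course's solution: the data is synthetic (generated with make_classification, not the course dataset), and the model is written to a temporary directory instead of the working directory. The modes 'wb' and 'rb' are used because pickle reads and writes bytes.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the exercise data (an assumption, not the course dataset)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the baseline model with a fixed seed so results are reproducible
clf = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Hypothetical destination: a temporary directory keeps the sketch side-effect free
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")

# Serialize the fitted model; pickle needs a file opened in binary write mode
with open(model_path, "wb") as file:
    pickle.dump(clf, file)

# Deserialize it, as the production environment would, in binary read mode
with open(model_path, "rb") as file:
    clf_from_file = pickle.load(file)

# The restored model should predict exactly what the original predicts
preds = clf_from_file.predict(X_test)
same_as_original = np.array_equal(preds, clf.predict(X_test))
print(same_as_original)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code. A pickled scikit-learn model is also best loaded with the same library version that produced it.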

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Fit a random forest to the training set
clf = ____(____=42).____(
  X_train, y_train)

# Save it to a file, to be pushed to production
with ____('model.pkl', ____) as ____:
    pickle.____(clf, file=file)

# Now load the model from file in the production environment
with ____ as file:
    clf_from_file = pickle.____(file)

# Predict the labels of the test dataset
preds = clf_from_file.____