Pickles
Finally, it is time for you to push your first model to production. It is a random forest classifier that you will use as a baseline while you work on a better alternative. The data is already split into training and test sets with their usual names, X_train, X_test, y_train and y_test. You also have access to the RandomForestClassifier class and the pickle module, whose functions .dump() and .load() you will need for this exercise.
This exercise is part of the course
Designing Machine Learning Workflows in Python
Exercise instructions
- Fit a random forest classifier to the data. Fix the random seed to 42 to ensure that your results are reproducible.
- Write the model to file using pickle. Open the destination file using the with open(____) as ____ syntax.
- Now load the model from file into a different variable name, clf_from_file.
- Store the predictions from the model you loaded into a variable preds.
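The write-then-read pattern from the instructions can be sketched with only the standard library. This is a minimal illustration, not the exercise solution: the `settings` dictionary is a hypothetical stand-in for a fitted model, and the temporary path is an assumption chosen so that the example does not touch any real file.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: any picklable Python object works.
settings = {"n_estimators": 100, "random_state": 42}

# A throwaway path for illustration only (an assumption, not the exercise's file).
path = os.path.join(tempfile.mkdtemp(), "settings.pkl")

# pickle writes bytes, so the file is opened in binary write mode ("wb").
with open(path, "wb") as file:
    pickle.dump(settings, file=file)

# Reading the object back requires binary read mode ("rb").
with open(path, "rb") as file:
    settings_from_file = pickle.load(file)

# The loaded object is an equal copy, not the same object in memory.
print(settings_from_file == settings)
```

Opening the file in text mode instead of binary mode raises a TypeError on write, which is a common first stumbling block with pickle.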
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Fit a random forest to the training set
clf = ____(____=42).____(
    X_train, y_train)
# Save it to a file, to be pushed to production
with ____('model.pkl', ____) as ____:
    pickle.____(clf, file=file)
# Now load the model from file in the production environment
with ____ as file:
    clf_from_file = pickle.____(file)
# Predict the labels of the test dataset
preds = clf_from_file.____
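Once a model has been saved and reloaded, one reasonable sanity check is to confirm that the restored copy predicts exactly what the original does. The sketch below assumes scikit-learn and NumPy are installed and uses synthetic data from `make_classification` in place of the exercise's dataset. The sample sizes, `n_estimators=10`, and the in-memory `pickle.dumps` / `pickle.loads` round trip are illustrative choices, not part of the exercise.

```python
import pickle

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the exercise's X_train, X_test, y_train, y_test.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A small forest keeps the sketch fast; n_estimators=10 is an arbitrary choice.
clf = RandomForestClassifier(n_estimators=10, random_state=42).fit(X_train, y_train)

# Serialize to bytes and back in memory, instead of going through a file.
clf_bytes = pickle.dumps(clf)
clf_restored = pickle.loads(clf_bytes)

# The restored model should reproduce the original predictions exactly.
same_predictions = np.array_equal(clf.predict(X_test), clf_restored.predict(X_test))
print(same_predictions)
```

A pickled model should only be loaded from a trusted source and, ideally, with the same scikit-learn version that produced it, since unpickling can execute arbitrary code and compatibility across versions is not guaranteed.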