Mushrooms: a matter of life or death
Let's conclude the course by revisiting the mushroom edibility problem. You'll try the stacking classifier to see if the score can be improved. As stacking uses a meta-estimator (second layer classifier) which attempts to correct predictions from the first layer, some of the misclassified instances could be corrected. This is a very important problem, as the edibility of a mushroom is a matter of life or death.
The dataset has been loaded and split into train and test sets. Do you think stacking can help to predict the edibility of a mushroom with greater confidence?
This exercise is part of the course
Ensemble Methods in Python
Exercise instructions
- Instantiate the first-layer estimators: a 5-nearest neighbors using the ball tree algorithm, a decision tree classifier with parameters
min_samples_leaf = 5
andmin_samples_split = 15
, and a Gaussian Naive Bayes classifier. - Build and fit a stacking classifier, using the parameters
classifiers
- a list containing the first-layer classifiers - andmeta_classifier
- the default logistic regression.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create the first-layer models
clf_knn = ____
clf_dt = ____(____, ____, random_state=500)
clf_nb = ____
# Create the second-layer model (meta-model)
clf_lr = LogisticRegression()
# Create and fit the stacked model
clf_stack = ____
clf_stack.fit(X_train, y_train)
# Evaluate the stacked model’s performance
print("Accuracy: {:0.4f}".format(accuracy_score(y_test, clf_stack.predict(X_test))))