Choosing the best model
In this exercise, you'll compare different classifiers and choose the one that performs the best.
The dataset here - already loaded and split into train and test sets - consists of Pokémon - their stats, types, and whether or not they're legendary. The objective of our classifiers is to predict this 'Legendary' variable.
Three individual classifiers have been fitted to the training set:
clf_lr is a logistic regression.
clf_dt is a decision tree.
clf_knn is a 5-nearest neighbors classifier.
As the classes here are imbalanced - only 65 of the 800 Pokémon in the dataset are legendary - we'll use the F1 score to evaluate performance. Scikit-learn's f1_score() has been imported for you.
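To see why accuracy is a poor guide with this class balance, here is a minimal sketch in plain Python. It assumes a hypothetical baseline that never predicts "legendary" and uses the whole-dataset counts quoted above (65 of 800); the helper names accuracy and f1_binary are illustrative and not part of the exercise. The F1 computation follows the usual binary definition, 2*TP / (2*TP + FP + FN).

```python
# Class balance taken from the exercise text: 65 legendary out of 800 Pokémon.
n_total, n_legendary = 800, 65
y_true = [1] * n_legendary + [0] * (n_total - n_legendary)

# Hypothetical baseline (an assumption for illustration): never predict "legendary".
y_pred = [0] * n_total


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true, y_pred):
    """Binary F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


baseline_accuracy = accuracy(y_true, y_pred)
baseline_f1 = f1_binary(y_true, y_pred)
print(f"accuracy = {baseline_accuracy:.5f}, F1 = {baseline_f1:.1f}")
```

The baseline is right on all 735 non-legendary Pokémon, so its accuracy looks high, yet it finds no legendary ones and its F1 score is zero.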
This exercise is part of the course Ensemble Methods in Python.
Have a go at this exercise by completing this sample code.
# Predict the labels of the test set
pred_lr = ____
pred_dt = ____
pred_knn = ____
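The overall workflow (fit, predict on the test set, score with F1, pick the best) can be sketched on synthetic data. This is not the exercise's solution or dataset: the data comes from make_classification with an assumed positive rate of roughly 8%, and the variable names X_train, X_test, y_train, y_test, the hyperparameters, and the random seeds are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Pokémon data (assumption: about 8% positives).
X, y = make_classification(
    n_samples=800, n_features=6, weights=[0.92], random_state=42
)
# Stratify so that train and test keep a similar class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Same three model families as in the exercise; settings here are assumptions.
classifiers = {
    "clf_lr": LogisticRegression(max_iter=1000),
    "clf_dt": DecisionTreeClassifier(random_state=42),
    "clf_knn": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)                  # fit on the training set
    pred = clf.predict(X_test)                 # predict the test-set labels
    scores[name] = f1_score(y_test, pred)      # F1 for the positive class

best = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.3f}")
print("Best by F1:", best)
```

Which model wins depends on the data, so the ranking on this synthetic set says nothing about the ranking on the Pokémon data.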