Effect of removing examples
Support vectors are defined as the training examples that influence the decision boundary; the remaining examples can be removed without changing it. In this exercise, you'll observe this behavior by removing the non-support vectors from the training set.
The wine quality dataset is already loaded into X and y (first two features only). (Note: we specify lims in plot_classifier() so that the two plots are forced to use the same axis limits and can be compared directly.)
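plot_classifier() is a helper supplied by the course, and its implementation isn't shown here. As a rough idea of why fixed axis limits make two plots directly comparable, here is a minimal sketch of a hypothetical helper, plot_with_fixed_limits(), built on matplotlib. The function name, its signature, the toy data from make_blobs, and the reading of lims as (x_min, x_max, y_min, y_max) are all assumptions for illustration, not the course's actual code.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.svm import SVC


def plot_with_fixed_limits(X, y, clf, lims, ax, steps=200):
    """Hypothetical helper: shade clf's predicted regions inside fixed axis limits."""
    x_min, x_max, y_min, y_max = lims  # assumed order: x range, then y range
    # Evaluate the classifier on a regular grid covering exactly the requested window
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, steps),
                         np.linspace(y_min, y_max, steps))
    zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax.contourf(xx, yy, zz, alpha=0.3)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolors="k")
    # Pin the limits so every plot drawn with the same lims shares one frame
    ax.set_xlim(x_min, x_max)
    ax.set_ylim(y_min, y_max)
    return ax


# Toy stand-in data (not the wine data), placed inside the window used in this exercise
X_toy, y_toy = make_blobs(n_samples=100, centers=[[12.5, 2.0], [13.5, 4.0]],
                          cluster_std=0.4, random_state=0)
lims = (11, 15, 0, 6)

fig, (ax_all, ax_half) = plt.subplots(1, 2, figsize=(8, 4))
clf_all = SVC(kernel="linear").fit(X_toy, y_toy)
plot_with_fixed_limits(X_toy, y_toy, clf_all, lims, ax_all)

# Every other point: fewer examples, but drawn in the same frame
X_half, y_half = X_toy[::2], y_toy[::2]
clf_half = SVC(kernel="linear").fit(X_half, y_half)
plot_with_fixed_limits(X_half, y_half, clf_half, lims, ax_half)
```

Call plt.show() to display the figure. Without the pinned limits, matplotlib would scale each panel to its own points, so a boundary could appear to move even when it had not.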
This exercise is part of the course Linear Classifiers in Python.
Exercise instructions
- Train a linear SVM on the whole data set.
- Create a new data set containing only the support vectors.
- Train a new linear SVM on the smaller data set.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
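# Train a linear SVM
# (X, y, and plot_classifier() are provided by the exercise environment)
from sklearn.svm import SVC
svm = SVC(kernel="linear")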
svm.fit(____)
plot_classifier(X, y, svm, lims=(11,15,0,6))
# Make a new data set keeping only the support vectors
print("Number of original examples:", len(X))
print("Number of support vectors:", len(svm.support_))
X_small = X[____]
y_small = y[____]
# Train a new SVM using only the support vectors
svm_small = SVC(kernel="linear")
svm_small.fit(____)
plot_classifier(X_small, y_small, svm_small, lims=(11,15,0,6))
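The two plots compare the decision boundaries by eye. As a rough numerical cross-check of the same idea, the sketch below fits a linear SVM on all examples and again on the support vectors only, then compares the learned coefficients and the predictions. It uses synthetic data from make_blobs as a stand-in (an assumption; this is not the wine data), and the names X_demo, y_demo, svm_full, and svm_sv are made up for illustration. The two fits should agree up to the solver's tolerance rather than exactly.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic two-class, two-feature data (a stand-in, not the wine data)
X_demo, y_demo = make_blobs(n_samples=200, centers=[[-2.0, -2.0], [2.0, 2.0]],
                            cluster_std=1.0, random_state=0)

# Fit on every example
svm_full = SVC(kernel="linear").fit(X_demo, y_demo)

# support_ holds the row indices of the support vectors in the training data
sv_idx = svm_full.support_

# Refit on the support vectors only
svm_sv = SVC(kernel="linear").fit(X_demo[sv_idx], y_demo[sv_idx])

print("Number of examples:", len(X_demo))
print("Number of support vectors:", len(sv_idx))

# A linear decision boundary is w . x + b = 0, so compare w (coef_) and b (intercept_)
coef_gap = np.max(np.abs(svm_full.coef_ - svm_sv.coef_))
intercept_gap = np.max(np.abs(svm_full.intercept_ - svm_sv.intercept_))
print("Largest coefficient difference:", coef_gap)
print("Largest intercept difference:", intercept_gap)

# Fraction of the original examples on which the two classifiers agree
agreement = np.mean(svm_full.predict(X_demo) == svm_sv.predict(X_demo))
print("Prediction agreement:", agreement)
```

Small but nonzero differences are expected, because the underlying solver stops at a numerical tolerance (the tol parameter of SVC) rather than at the exact optimum.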