Effect of removing examples

Support vectors are defined as training examples that influence the decision boundary. In this exercise, you'll observe this behavior by removing non support vectors from the training set.

The wine quality dataset is already loaded into X and y (first two features only). (Note: we specify lims in plot_classifier() so that the two plots are forced to use the same axis limits and can be compared directly.)

This exercise is part of the course

Linear Classifiers in Python

View Course

Exercise instructions

  • Train a linear SVM on the whole data set.
  • Create a new data set containing only the support vectors.
  • Train a new linear SVM on the smaller data set.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Train a linear SVM
svm = SVC(kernel="linear")
svm.fit(____)
plot_classifier(X, y, svm, lims=(11,15,0,6))

# Make a new data set keeping only the support vectors
print("Number of original examples", len(X))
print("Number of support vectors", len(svm.support_))
X_small = X[____]
y_small = y[____]

# Train a new SVM using only the support vectors
svm_small = SVC(kernel="linear")
svm_small.fit(____)
plot_classifier(X_small, y_small, svm_small, lims=(11,15,0,6))