Tuning linear SVMs
1. Tuning linear SVMs
In the previous lesson we learned how to build a linear SVM. There we used the default value of 1 for the cost. This resulted in a soft margin classifier, that is, one in which the margin is wide and some points are allowed to lie within it. In this lesson, we will learn how to tune the margin of an SVM by tweaking the cost parameter. We will also learn about hard and soft margin SVMs, and where each is useful.
2. Linear SVM, default cost

As a reminder, here's the code to create the default cost linear SVM for our linearly separable dataset. Note the large number of support vectors: 55 in all. Let's remind ourselves of what this solution looks like.
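For reference, here is a minimal sketch of that code in R, using svm() from the e1071 package. The data frame trainset and its columns x1, x2 and y are assumed names carried over from the previous lesson, not shown in this transcript.

    library(e1071)

    # Linear SVM with the default cost (cost = 1).
    # Assumes trainset has numeric columns x1, x2 and a factor label y,
    # and that the predictors are on comparable scales (hence scale = FALSE).
    svm_model <- svm(y ~ .,
                     data = trainset,
                     type = "C-classification",
                     kernel = "linear",
                     cost = 1,
                     scale = FALSE)

    # Total number of support vectors (55 for our dataset).
    svm_model$tot.nSV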
3. Visualizing boundaries & margins (recap)

Here's the plot that we created in the previous lesson. The main things to note are that a) the margin is wide and b) a large number of points lie within the margin boundaries. Let's now see what happens if we increase the cost to 100.
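As a reminder of how such a plot can be built, here is one sketch using ggplot2. It recovers the boundary from the fitted weight vector, which only works in the original coordinates because the model above was fit with scale = FALSE.

    library(ggplot2)

    # Weight vector of the linear decision boundary: f(x) = w . x - rho = 0.
    w <- t(svm_model$coefs) %*% svm_model$SV
    slope     <- -w[1] / w[2]
    intercept <- svm_model$rho / w[2]

    # Data points, decision boundary (solid) and margin boundaries (dashed);
    # the margins sit at a vertical offset of 1/w[2] on either side.
    ggplot(trainset, aes(x = x1, y = x2, colour = y)) +
      geom_point() +
      geom_abline(slope = slope, intercept = intercept) +
      geom_abline(slope = slope, intercept = intercept - 1 / w[2], linetype = "dashed") +
      geom_abline(slope = slope, intercept = intercept + 1 / w[2], linetype = "dashed")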
4. Linear SVM with cost = 100

On increasing the cost to 100, we see the number of support vectors is drastically reduced from 55 to 6. Let's see what the decision boundary and margin look like.
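Only the cost argument changes; a sketch, reusing the assumed names from above:

    # Same model, but with a much higher cost of 100.
    svm_model_100 <- svm(y ~ .,
                         data = trainset,
                         type = "C-classification",
                         kernel = "linear",
                         cost = 100,
                         scale = FALSE)

    svm_model_100$tot.nSV  # down from 55 to 6 for our dataset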
5. Decision and margin boundaries (cost = 100)

Here's the plot for the cost equals 100 case. The key points to note are that a) the margin is much narrower than for the cost equals 1 case and b) the number of margin violations is virtually zero. The narrow margin assures us that the slope and intercept of the decision boundary are close to their correct values of 1 and 0, respectively. You can check that this is so using the material presented in the previous lesson.
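One way to run that check, reusing the weight-vector calculation shown earlier (again a sketch against the assumed model object):

    # Slope and intercept implied by the cost = 100 model.
    w <- t(svm_model_100$coefs) %*% svm_model_100$SV
    -w[1] / w[2]              # slope: should be close to 1
    svm_model_100$rho / w[2]  # intercept: should be close to 0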
6. Implication

The implication is that it can be useful to narrow the margin, by increasing the cost, when the decision boundary is known to be linear. However, this is rarely the case in real life, so let's now look at a non-linearly separable situation.
7. A nonlinearly separable dataset

The dataset shown here is not linearly separable, as is evident from the misclassified red and blue points that lie on the wrong side of the linear boundary. Let's build two linear SVMs for this dataset: one with cost equals 100 and the other with cost equals 1. Let's look at the higher cost case first.
8. Nonlinear dataset, linear SVM (cost = 100)

We build a linear SVM in the usual way, using a training dataset composed of 80% of the data. Then we calculate the training and test accuracy of the model. The test accuracy is 85% for this particular train/test split. I repeated this for 50 random train/test splits and got an average test accuracy of 82-point-9%. OK, so let's see what the solution looks like.
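A sketch of this procedure, assuming the nonlinearly separable data lives in a data frame df with the same x1, x2 and y columns:

    set.seed(10)

    # 80/20 train/test split.
    train_rows <- sample(nrow(df), round(0.8 * nrow(df)))
    trainset <- df[train_rows, ]
    testset  <- df[-train_rows, ]

    svm_model <- svm(y ~ ., data = trainset,
                     type = "C-classification",
                     kernel = "linear", cost = 100, scale = FALSE)

    # Accuracy = proportion of correctly classified points.
    mean(predict(svm_model, trainset) == trainset$y)  # training accuracy
    mean(predict(svm_model, testset)  == testset$y)   # test accuracy

    # Average test accuracy over 50 random train/test splits.
    accuracies <- replicate(50, {
      rows <- sample(nrow(df), round(0.8 * nrow(df)))
      fit  <- svm(y ~ ., data = df[rows, ],
                  type = "C-classification",
                  kernel = "linear", cost = 100, scale = FALSE)
      mean(predict(fit, df[-rows, ]) == df$y[-rows])
    })
    mean(accuracies)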
9. Cost = 100 solution, nonlinear dataset

On plotting the decision and margin boundaries, we see that the margin is wide despite the high cost. This suggests that the true decision boundary is not linear (and we know that it isn't!). However, the misclassified points outside the margin boundaries hint that we may be able to get better accuracy by widening the margin even further. Let's do this by reducing the cost.
10. Nonlinear dataset, linear SVM (cost = 1)

We rebuild the model setting cost equals 1. The test accuracy increases by about 1-point-5%. On repeating this for 50 random train/test splits, I got an average test accuracy of 83-point-7%, an improvement of about 0-point-8%. The improvement is, excuse the pun, marginal, but tangible.
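A sketch of the rebuild; only the cost changes from the previous block:

    # Rebuild with the default cost of 1 on the same split.
    svm_model_1 <- svm(y ~ ., data = trainset,
                       type = "C-classification",
                       kernel = "linear", cost = 1, scale = FALSE)

    mean(predict(svm_model_1, testset) == testset$y)  # test accuracy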
11. Cost = 1 solution, nonlinear dataset

The plot shows that the margin has widened, and the misclassified points that lay outside the margin for the cost equals 100 case are now almost all within the margin boundaries. This assures us that the true decision boundary, whatever its shape, will lie within the margin.
12. Time to practice!

That brings us to the end of the lesson. Let's try some exercises.