
Linear SVMs on radially separable data

1. Linear SVMs on radially separable data

In this lesson, we'll attempt to use linear SVMs to classify the radially separable data we generated in the previous lesson. We know this will not work very well, but it is a good way to reinforce what we have learned so far and take our first steps towards learning how to handle complex decision boundaries.

2. Linear SVM, cost = 1

The model-building process should now be familiar: we partition the data into training and test sets using the usual 80/20 split. I'm not showing this step here, as we have done it many times in previous lessons; if you want to reproduce the calculations, note that the random number seed used is 10. We then build a linear SVM with the default cost on the training dataset. Note the large number of support vectors (more than 60% of the dataset). This tells us the classifier is not very good, a point confirmed by its less-than-stellar accuracy. We also generate a plot using the plot() function for svm objects, which I'll show you on the next slide.
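The steps just described can be sketched in R as follows. This is a minimal sketch, not the lesson's exact code: it assumes a data frame `df` with predictor columns `x1` and `x2` and a factor label column `y`, as generated in the previous lesson, and uses the svm() function from the e1071 package.

```r
library(e1071)

set.seed(10)

# 80/20 train/test split (assumed data frame df with columns x1, x2, y)
df[, "train"] <- ifelse(runif(nrow(df)) < 0.8, 1, 0)
trainset <- df[df$train == 1, ]
testset  <- df[df$train == 0, ]
trainset$train <- NULL
testset$train  <- NULL

# linear SVM with the default cost (cost = 1)
svm_model <- svm(y ~ ., data = trainset,
                 type = "C-classification", kernel = "linear")

# number of support vectors appears in the summary
summary(svm_model)

# test-set accuracy: fraction of correct predictions
pred_test <- predict(svm_model, testset)
mean(pred_test == testset$y)
```

With radially separable data, expect the support-vector count to be a large fraction of the training set and the accuracy to be close to chance.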

3. Plot: Linear SVM, default cost

The plot highlights just how badly the linear classifier does: every point in the training set is assigned a classification of 1. This may be an artifact of this particular train/test split, but let's see if we can do better by reducing the margin. As you may recall, we reduce the margin by increasing the cost.
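For reference, a plot like the one on this slide comes from e1071's plot method for svm objects. Assuming a fitted model `svm_model` and training data `trainset` (hypothetical names), the call is:

```r
# decision-boundary plot for an svm object; support vectors are
# drawn as crosses, other points as circles
plot(svm_model, trainset)
```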

4. Linear SVM, cost = 100

On increasing the cost to 100, we see that the number of support vectors increases, and there is virtually no change in accuracy from the cost = 1 case.
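A sketch of the refit, assuming the same hypothetical `trainset`, `testset`, and label column `y` as before:

```r
library(e1071)

# same linear SVM, but with a much higher cost (narrower margin)
svm_model_100 <- svm(y ~ ., data = trainset,
                     type = "C-classification", kernel = "linear",
                     cost = 100)

# compare support-vector count and accuracy with the cost = 1 model
summary(svm_model_100)
mean(predict(svm_model_100, testset) == testset$y)
```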

5. Plot: Linear SVM, cost = 100

The plot of the cost = 100 classifier clearly shows that there is no change from the cost = 1 case: the model still assigns a classification of 1 to every point in the training set. As I mentioned earlier, this result could well be an artifact of the particular train/test split we have used.

6. A better estimate of accuracy

To get a more reliable estimate of accuracy, we should calculate the average accuracy over a large number of independent train/test splits and check the standard deviation of the result to gauge its variability. We'll do this next. Even so, this example already suggests that linear classifiers are unlikely to work well for this dataset.

7. Average accuracy for default cost SVM

Here we calculate the accuracy for 100 different train/test splits, then compute the average accuracy and its standard deviation.
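The procedure can be sketched as below. Again this is a sketch under assumptions, not the lesson's exact code: it assumes the same hypothetical data frame `df` with predictors `x1`, `x2` and factor label `y`, and repeats the 80/20 split with a fresh random partition on each iteration.

```r
library(e1071)

n_trials <- 100
accuracy <- numeric(n_trials)

for (i in 1:n_trials) {
  # fresh random 80/20 split each iteration
  df[, "train"] <- ifelse(runif(nrow(df)) < 0.8, 1, 0)
  trainset <- df[df$train == 1, ]
  testset  <- df[df$train == 0, ]
  trainset$train <- NULL
  testset$train  <- NULL

  # default-cost linear SVM on this split
  model <- svm(y ~ ., data = trainset,
               type = "C-classification", kernel = "linear")

  # record test-set accuracy for this split
  accuracy[i] <- mean(predict(model, testset) == testset$y)
}

# average accuracy and its variability across splits
mean(accuracy)
sd(accuracy)
```

The mean tells us the typical performance; the standard deviation tells us how much that performance swings from one random split to another.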

8. How well does a linear SVM perform?

This means that the linear classifier does just a little better than a coin toss. In the next chapter, we'll use our knowledge of the boundary to do much better, which will then lead us to a powerful generalization that can be used to tackle complex, even disjoint, decision boundaries.

9. Time to practice!

But first, some exercises.