
The RBF Kernel

1. The RBF Kernel

Let's start with a quick recap. Earlier in this chapter we created a dataset with a complex figure-of-8 classification boundary and saw that SVMs with polynomial kernels do not do a very good job of classifying it. This motivated the introduction of the exponential radial basis function (or RBF) kernel.

2. RBF Kernel in a nutshell

The RBF kernel is a decreasing function of the distance between two points. It thus encodes the same principle as k Nearest Neighbors: the closer two points are to each other in attribute space, the more likely they are to be similar.
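
For concreteness, the kernel can be written in a couple of lines of R. This is a minimal sketch rather than code from the lesson; gamma controls how quickly a point's influence decays with distance:

    # Gaussian RBF kernel: influence decays exponentially with squared distance
    rbf_kernel <- function(x1, x2, gamma) {
      exp(-gamma * sum((x1 - x2)^2))
    }

    rbf_kernel(c(0, 0), c(1, 1), gamma = 1)  # ~0.135: a nearby point has modest influence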

3. Influence as a function of distance

The plot shown here illustrates this principle: the farther one moves from a point located at the origin, the smaller its influence.
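
A plot of this kind is easy to reproduce yourself; here is a sketch, where the choice of gamma = 1 is arbitrary and only for illustration:

    # influence of a point at the origin, as a function of distance
    d <- seq(0, 3, by = 0.01)
    gamma <- 1                      # arbitrary value for illustration
    plot(d, exp(-gamma * d^2), type = "l",
         xlab = "Distance from origin", ylab = "Influence")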

4. Building an SVM using the RBF kernel

OK, so let's build an SVM using an RBF kernel with default settings for the complex dataset we built in the first lesson of this chapter. As before, I have partitioned the data into training and test sets using the usual 80/20 split. The partitioning process is not shown, as it should be quite familiar by now. We then calculate the training and test accuracies, which are both around 93%, considerably better than the 86% we got with a quadratic kernel in the previous lesson. So let's see what the decision boundary looks like.
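
In outline, the model-building and accuracy calculations look something like the sketch below; the object and column names (trainset, testset, and the label y) are placeholders, not necessarily the ones used in the lesson:

    library(e1071)

    # placeholder names: trainset/testset hold the 80/20 split, y is the class label
    svm_model <- svm(y ~ ., data = trainset,
                     type = "C-classification", kernel = "radial")

    # training and test accuracies
    mean(predict(svm_model, trainset) == trainset$y)
    mean(predict(svm_model, testset)  == testset$y)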

5. Visualizing the decision boundary

The main thing to note in this figure is that, in contrast to the polynomial kernel case, the predicted decision boundary has an hourglass shape that approximates the figure of 8 shape of the actual boundary. However, the plot also shows that there is room for improvement. Let's see if we can do better.
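
If you want to see the boundary for yourself, the e1071 package provides a plot method for svm objects; assuming the same placeholder names as above:

    # decision regions, support vectors and data points in one figure
    plot(svm_model, trainset)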

6. Refining the decision boundary

We'll refine the decision boundary by tuning the gamma and cost parameters. We'll let gamma vary from 0.05 to 500 and cost from 0.01 to 100, in powers of 10 at each step. The tuning step can take a while because the algorithm builds a model for every combination of parameters and returns the combination that minimizes the error. This combination is returned in the variable best.parameters. In this case, the best model turns out to be the one with cost equal to 1 and gamma equal to 5.
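
A grid search of this kind can be run with tune.svm from e1071. Again a sketch with the same placeholder names, but the parameter grids match the ranges just described:

    # grid search: gamma from 0.05 to 500 and cost from 0.01 to 100, in powers of 10
    tune_out <- tune.svm(y ~ ., data = trainset,
                         type = "C-classification", kernel = "radial",
                         gamma = c(0.05, 0.5, 5, 50, 500),
                         cost  = c(0.01, 0.1, 1, 10, 100))

    tune_out$best.parameters   # here: cost = 1, gamma = 5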

7. The tuned model

OK, so we built the tuned model using the best values of cost and gamma. The test accuracy turns out to be 95%, which is only marginally better than the default model's, but let's look at the plot.
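
Rebuilding the model with the winning combination looks something like this; note that tune.svm also returns a ready-made best.model, which you could use directly instead:

    # rebuild the model with the winning parameter combination
    svm_model_tuned <- svm(y ~ ., data = trainset,
                           type = "C-classification", kernel = "radial",
                           cost  = tune_out$best.parameters$cost,
                           gamma = tune_out$best.parameters$gamma)

    # test accuracy of the tuned model
    mean(predict(svm_model_tuned, testset) == testset$y)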

8. Tuned decision boundary

OK, so the plot clearly shows that the tuned model does a much better job of capturing the actual figure of 8 boundary, which indicates that it is indeed considerably superior to the default one. In this case we could check in this way because we knew the actual decision boundary; in real-life situations this will not be the case. What remains true in general is that the local nature of the RBF kernel enables it to capture the nuances of a complex boundary, something that is simply not possible with a linear or polynomial kernel.

9. Time to practice!

That brings us to the end of our discussion of RBF kernels and, indeed, of this course. To be sure, we have barely scratched the surface of support vector machines, but I hope this introduction to a powerful classification method has given you enough to get started and explore further. So that's it from my side. Before you go, let's do a couple of exercises to consolidate the ideas of RBF kernels and SVMs. And, finally, many thanks for working through the course; I hope you found it useful.