Sugar content of soft drinks

1. Introduction

Hi, I'm Kailash Awati. In this course, I'm going to give you a visually oriented introduction to support vector machines. I'll use the abbreviation SVM for support vector machines henceforth.

2. Preliminaries

The objective of the course is to develop an intuitive understanding of how SVMs work, the different options available in the SVM algorithm, and the situations in which they work best. I'll assume you have an intermediate knowledge of R and some experience with visualization using ggplot(). We'll start with a simple one-dimensional example and build our understanding through datasets of increasing complexity. To keep things simple, we'll stick with binary classification problems: that is, problems that have two classes. OK, let's get started.

3. Sugar content of soft drinks

A soft drink manufacturer has two versions of their flagship brand: a regular version, Choke, with sugar content 11g per 100ml and a reduced sugar offering, Choke-R, with sugar content 8g per 100ml. In practice, though, the sugar content varies quite a bit. Given 25 random samples of Choke and Choke-R, our task is to determine a decision rule to distinguish between the two. Let's see if we can identify such a rule visually.

4. Sugar content of soft drinks - visualization code

The sugar content data has been loaded into the drink_samples dataframe, which contains sugar content measurements on a set of samples. We first specify the data frame and tell ggplot() to plot the sugar content on the x-axis. The y-coordinate is set to zero as there is only one variable. In the subsequent lines, we tell ggplot() to create a scatter plot and do some labeling.
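The slide code isn't reproduced in this transcript, so here is a minimal sketch of what such a plot could look like. The column name sugar_content and the sample values are assumptions made up for illustration; only the two edge values, 8.8 and 10, come from the plot described in this lesson.

```r
library(ggplot2)

# Hypothetical stand-in for drink_samples: the values are invented for
# illustration, except that the clusters end at 8.8 and start at 10.
drink_samples <- data.frame(
  sugar_content = c(7.3, 7.9, 8.1, 8.4, 8.8, 10.0, 10.4, 10.9, 11.3, 11.8)
)

# Sugar content goes on the x-axis; y is fixed at zero because
# there is only one variable.
plot_samples <- ggplot(data = drink_samples, aes(x = sugar_content, y = 0)) +
  geom_point() +                                        # scatter plot
  geom_text(aes(label = sugar_content), vjust = 2) +    # label each point
  labs(x = "Sugar content (g per 100 ml)", y = NULL)

plot_samples
```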

5. Sugar content plot

The plot shows two distinct clusters. The gap between them is bounded by the data points at 8.8 g per 100 ml and 10 g per 100 ml. These clusters correspond to the two brands. Now, any point lying between these two data points would be an acceptable separating boundary between the classes. A separating boundary between classes is called a separator or decision boundary.

6. Decision boundaries

Let's pick two points in the interval, say 9.1 and 9.7 g per 100 ml, as candidate decision boundaries. The decision rules for these are shown on the slide. Let's visualize these.
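The slide isn't part of this transcript, but a decision rule of this kind can be sketched in a few lines of R. The function name classify is hypothetical, and the rule assumes that samples above the boundary are the regular version, Choke, since it has the higher sugar content.

```r
# Candidate decision boundaries from the text (g per 100 ml).
boundaries <- c(9.1, 9.7)

# Hypothetical helper: samples with sugar content above the boundary
# are labeled "Choke", all others "Choke-R".
classify <- function(sugar_content, boundary) {
  ifelse(sugar_content > boundary, "Choke", "Choke-R")
}

# The two edge samples get the same labels under either boundary.
classify(c(8.8, 10), boundary = boundaries[1])
classify(c(8.8, 10), boundary = boundaries[2])

# A new sample between the two boundaries is labeled differently
# depending on which boundary we pick.
classify(9.4, boundary = boundaries[1])
classify(9.4, boundary = boundaries[2])
```

Both rules separate the observed samples perfectly. They only disagree on new samples that fall between 9.1 and 9.7.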

7. Decision boundaries - visualization code

We create a dataframe with the decision boundaries

8. Decision boundaries - visualization code

and add them to the plot using geom_point(), distinguishing them from the sample points by making them bigger and coloring them red.
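Put together, the two steps might look something like the sketch below. The data frame name d_bounds, the column name sep, and the sample values (other than 8.8 and 10) are assumptions for illustration, not the course's actual code.

```r
library(ggplot2)

# Hypothetical stand-in for drink_samples (invented values, apart from
# the edge points at 8.8 and 10).
drink_samples <- data.frame(
  sugar_content = c(7.3, 7.9, 8.1, 8.4, 8.8, 10.0, 10.4, 10.9, 11.3, 11.8)
)

# Data frame holding the two candidate decision boundaries.
d_bounds <- data.frame(sep = c(9.1, 9.7))

# Sample points in the default style, decision boundaries as
# bigger red points on the same axis.
plot_bounds <- ggplot(data = drink_samples, aes(x = sugar_content, y = 0)) +
  geom_point() +
  geom_point(data = d_bounds, aes(x = sep, y = 0), color = "red", size = 3) +
  labs(x = "Sugar content (g per 100 ml)", y = NULL)

plot_bounds
```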

9. Plot of decision boundaries

Here's the plot. An important concept is that of the margin, which is the distance between the decision boundary and the closest data point. For example, for the decision boundary at 9.1 g per 100 ml, the closest point is 8.8 g per 100 ml, so the margin is 9.1 minus 8.8, which is 0.3 g per 100 ml. You can figure out the margin for the other decision boundary.
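The margin calculation is easy to express in code. Here's a small sketch, where the function name margin is hypothetical. It uses only the two edge samples named in this lesson, since all other samples lie further from the boundary and would not change the result.

```r
# Edge samples of the two clusters, as read off the plot (g per 100 ml).
edge_samples <- c(8.8, 10)

# Hypothetical helper: the margin is the distance from the decision
# boundary to the closest data point.
margin <- function(boundary, samples) {
  min(abs(samples - boundary))
}

# Margin for the boundary at 9.1: approximately 0.3 (9.1 minus 8.8).
margin(9.1, edge_samples)

# Try the other boundary yourself.
margin(9.7, edge_samples)
```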

10. Maximum margin separator

Now, the best decision boundary is one that maximizes the margin. This is called the maximal margin boundary or separator. It should be clear that the maximal margin separator lies halfway between the two clusters, that is, at the midpoint of the line joining the sample data points at 8.8 and 10 g per 100 ml. Let's add this to our plot. To do this, we create a data frame containing the separator and add it to the plot using geom_point(). To distinguish the maximal margin separator from the sample points and previous decision boundaries, we'll make it a bit bigger and color it blue.
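As a rough sketch of these steps: the midpoint of 8.8 and 10 is 9.4, which gives a margin of 0.6 on either side. The names mm_sep and sep, and all sample values other than 8.8 and 10, are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for drink_samples (invented values, apart from
# the edge points at 8.8 and 10).
drink_samples <- data.frame(
  sugar_content = c(7.3, 7.9, 8.1, 8.4, 8.8, 10.0, 10.4, 10.9, 11.3, 11.8)
)

# Earlier candidate boundaries, kept for comparison.
d_bounds <- data.frame(sep = c(9.1, 9.7))

# The maximal margin separator is the midpoint between the closest
# points of the two clusters.
mm_sep <- data.frame(sep = (8.8 + 10) / 2)

# Samples in the default style, earlier boundaries in red,
# maximal margin separator as a slightly bigger blue point.
plot_mm <- ggplot(data = drink_samples, aes(x = sugar_content, y = 0)) +
  geom_point() +
  geom_point(data = d_bounds, aes(x = sep, y = 0), color = "red", size = 3) +
  geom_point(data = mm_sep, aes(x = sep, y = 0), color = "blue", size = 4) +
  labs(x = "Sugar content (g per 100 ml)", y = NULL)

plot_mm
```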

11. Plot of maximal margin separator

The plot makes it clear that the blue point is the best decision boundary because it is furthest from both clusters and is therefore the most robust to noise. This simple example serves to illustrate a key feature of SVM algorithms, which is that they find decision boundaries that maximize the margin. Keep this in mind as we work through examples of increasing complexity in this course.

12. Time to practice!

That's it for this lesson. Let's try some examples.