Generating a radially separable dataset
1. Generating a radially separable dataset
In the previous chapters we created a linearly separable dataset and used it to illustrate the principles behind linear kernel SVMs. In this lesson we'll create a radially separable dataset that we will subsequently use to work with a more complex kernel. Much of what we'll do in this lesson parallels what we did in Chapter 1, where we created a linearly separable dataset.
2. Generating a 2d uniformly distributed set of points
We generate a dataset of 200 points with two predictor variables, x1 and x2, uniformly distributed between -1 and 1. From what we did in Chapter 1, we know that this can be easily done using the runif() function. Note that we have to specify the min and max values as they are not the defaults.
3. Create a circular boundary
Next, we create a circular boundary with a radius of 0.7 units by creating a categorical variable, y, which takes on a value of -1 for points that lie inside the boundary and +1 for points that lie outside it.
4. Plot the dataset
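Before plotting, here is a minimal sketch of the two steps described so far: generating the points and labeling them. The seed and the names n, df, and radius are assumptions made for illustration; the lesson only specifies 200 points, the range -1 to 1, and a radius of 0.7.

```r
# Sketch only: the seed and the names n, df, radius are illustrative assumptions.
set.seed(42)

n <- 200
df <- data.frame(
  x1 = runif(n, min = -1, max = 1),  # runif() defaults to min = 0, max = 1
  x2 = runif(n, min = -1, max = 1)
)

radius <- 0.7

# Label each point: -1 if it falls inside the circle, +1 otherwise.
df$y <- factor(
  ifelse(df$x1^2 + df$x2^2 < radius^2, -1, 1),
  levels = c(-1, 1)
)

table(df$y)  # class counts
```

Comparing squared distances against the squared radius avoids taking a square root for every point.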
As usual, we will use ggplot() to visualize the data. We plot x1 and x2 against the coordinate axes and distinguish the class by color. The idiom should now be quite familiar to you. Let's see what the data looks like.
5. Plot of radially separable dataset
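A plot of this kind could be produced with something like the sketch below. It rebuilds the data so that it runs on its own; the seed, the color choices, and the name scatter_plot are assumptions, not taken from the lesson.

```r
library(ggplot2)

# Rebuild the dataset (seed is an illustrative assumption).
set.seed(42)
df <- data.frame(x1 = runif(200, min = -1, max = 1),
                 x2 = runif(200, min = -1, max = 1))
df$y <- factor(ifelse(df$x1^2 + df$x2^2 < 0.7^2, -1, 1), levels = c(-1, 1))

# x1 and x2 on the axes, class shown by color (colors chosen arbitrarily).
scatter_plot <- ggplot(df, aes(x = x1, y = x2, color = y)) +
  geom_point() +
  scale_color_manual(values = c("-1" = "red", "1" = "blue"))

scatter_plot
```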
OK, here's the plot. As expected, the points associated with the -1 class are near the center of the plot, with the +1 class points towards the edges. Let's add the boundary to the plot to make the separation between classes visually clearer.
6. Adding a circular boundary - Part 1
We need to create a circular boundary. ggplot() has no built-in function to generate circles, although the newer ggforce package does. Instead of using ggforce, we'll generate a circle ourselves by defining a function to do it. Here's the code. The function returns a dataframe containing npoint points (the default is 100) that lie on a circle of radius r, centred at the specified x1 and x2 center coordinates. We'll use this function to generate the required boundary.
7. Adding a circular boundary - Part 2
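The circle-generating function from Part 1 is not reproduced in this transcript. A minimal sketch of what such a function might look like follows; the names circle, x1_center, x2_center, x1c, and x2c are hypothetical, and only r and npoint (default 100) come from the description.

```r
# Hypothetical reconstruction: the function name, the center argument names,
# and the output column names are assumptions.
circle <- function(x1_center, x2_center, r, npoint = 100) {
  # Evenly spaced angles covering one full turn.
  theta <- seq(0, 2 * pi, length.out = npoint)
  data.frame(
    x1c = x1_center + r * cos(theta),
    x2c = x2_center + r * sin(theta)
  )
}

# Boundary of radius 0.7 centred at the origin.
boundary <- circle(x1_center = 0, x2_center = 0, r = 0.7)
head(boundary)
```

Because the angles run from 0 to 2 * pi inclusive, the first and last points coincide, so a path drawn through them closes the circle.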
To add the boundary to the plot, we first generate the boundary using the function we just created and then add it to the plot using the geom_path() function from ggplot. The last argument to geom_path() tells ggplot() that the earlier settings for the x and y coordinates should be overridden for this layer. Alright, let's see what the plot looks like.
8. Plot of radially separable dataset showing boundary
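A plot like this one could be produced along the lines of the sketch below. It assumes the overriding argument is inherit.aes = FALSE, which is the standard ggplot2 way to stop a layer from inheriting the plot-level mappings; the seed, the column names x1c and x2c, and the use of coord_fixed() are likewise assumptions.

```r
library(ggplot2)

# Rebuild the dataset (seed is an illustrative assumption).
set.seed(42)
df <- data.frame(x1 = runif(200, min = -1, max = 1),
                 x2 = runif(200, min = -1, max = 1))
df$y <- factor(ifelse(df$x1^2 + df$x2^2 < 0.7^2, -1, 1), levels = c(-1, 1))

# Points on a circle of radius 0.7 centred at the origin
# (column names x1c and x2c are hypothetical).
theta <- seq(0, 2 * pi, length.out = 100)
boundary <- data.frame(x1c = 0.7 * cos(theta), x2c = 0.7 * sin(theta))

boundary_plot <- ggplot(df, aes(x = x1, y = x2, color = y)) +
  geom_point() +
  # inherit.aes = FALSE: this layer uses only its own data and mappings.
  geom_path(data = boundary, aes(x = x1c, y = x2c), inherit.aes = FALSE) +
  coord_fixed()  # equal axis scales so the circle is not drawn as an ellipse

boundary_plot
```

Without inherit.aes = FALSE, the path layer would also try to apply the color = y mapping, which the boundary dataframe cannot satisfy because it has no y column.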
The plot explicitly shows the circular decision boundary. We'll use this dataset to start exploring more complex kernels.
9. Time to practice!
But before that, let's do a few exercises.