Visualizing a sugar content dataset

In this exercise, you will create a one-dimensional scatter plot of 25 soft drink sugar content measurements. The aim is to visualize distinct clusters in the dataset as a first step towards identifying candidate decision boundaries.

The 25 sugar content measurements are stored in the sugar_content column of the data frame df, which has been preloaded for you.

This exercise is part of the course

Support Vector Machines in R

Exercise instructions

Load the ggplot2 package.
List the variables in the data frame df.
Complete the scatter plot code. Use df as the dataset and plot the sugar content of the samples along the x-axis (with y equal to zero).
Write ggplot() code to display the sugar content in df as a scatter plot. Can you spot two distinct clusters corresponding to samples with high and low sugar content?
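
The clustering question in the last instruction can also be checked numerically. The sketch below is not part of the course materials. It assumes a stand-in df whose 25 values are invented and evenly spaced purely for illustration, since the real preloaded measurements are not shown here. It sorts the values and looks for the widest gap between neighbours; any value inside that gap separates the low-sugar group from the high-sugar group.

```r
# ASSUMPTION: stand-in for the preloaded df. These 25 values are made up
# for illustration and are NOT the course's measurements.
df <- data.frame(
  sugar_content = round(c(seq(7, 9, length.out = 12),
                          seq(10, 12, length.out = 13)), 2)
)

# Sort the measurements and compute the gap between each neighbouring pair
sorted_sugar <- sort(df$sugar_content)
gaps <- diff(sorted_sugar)

# The widest gap marks where the two clusters separate
widest <- which.max(gaps)
lower_edge <- sorted_sugar[widest]      # largest value in the low cluster
upper_edge <- sorted_sugar[widest + 1]  # smallest value in the high cluster

c(lower_edge = lower_edge, upper_edge = upper_edge)
```

With real data the gap will be less tidy, so treat this as a quick sanity check alongside the plot rather than a replacement for it.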

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Load ggplot2
___

# Print variable names
___

# Plot sugar content along the x-axis
plot_df <- ggplot(data = ___, aes(x = ___, y = ___)) +
    geom_point() + 
    geom_text(aes(label = sugar_content), size = 2.5, vjust = 2, hjust = 0.5)

# Display plot
plot_df
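
One possible way to fill in the blanks is sketched below. It is not the official solution. Because the preloaded df is not available outside the course environment, the sketch builds a stand-in df with 25 invented values (an assumption, clearly not the real measurements) so that the code runs on its own. Setting y to the constant 0 inside aes() places every point on a single horizontal line, which is what makes the plot one-dimensional.

```r
# Load ggplot2
library(ggplot2)

# ASSUMPTION: stand-in for the preloaded df. These 25 values are made up
# for illustration and are NOT the course's measurements.
df <- data.frame(
  sugar_content = round(c(seq(7, 9, length.out = 12),
                          seq(10, 12, length.out = 13)), 2)
)

# Print variable names
names(df)

# Plot sugar content along the x-axis, with every point at y = 0
plot_df <- ggplot(data = df, aes(x = sugar_content, y = 0)) +
    geom_point() +
    geom_text(aes(label = sugar_content), size = 2.5, vjust = 2, hjust = 0.5)

# Display plot
plot_df
```

In the course environment, the stand-in data frame should be skipped so that the plot uses the preloaded df.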

This chapter introduces some key concepts of support vector machines through a simple 1-dimensional example. Students are also walked through the creation of a linearly separable dataset that is used in the subsequent chapter.

Exercise 1: Sugar content of soft drinks
Exercise 2: Visualizing a sugar content dataset (current exercise)
Exercise 3: Identifying decision boundaries
Exercise 4: Find the maximal margin separator
Exercise 5: Visualize the maximal margin separator
Exercise 6: Generating a linearly separable dataset
Exercise 7: Generate a 2d uniformly distributed dataset
Exercise 8: Create a decision boundary
Exercise 9: Add a margin to the dataset

Introduces students to the basic concepts of support vector machines by applying the svm algorithm to a dataset that is linearly separable. Key concepts are illustrated through ggplot visualisations that are built from the outputs of the algorithm and the role of the cost parameter is highlighted via a simple example. The chapter closes with a section on how the algorithm deals with multiclass problems.

Exercise 1: Linear Support Vector Machines
Exercise 2: Creating training and test datasets
Exercise 3: Building a linear SVM classifier
Exercise 4: Exploring the model and calculating accuracy
Exercise 5: Visualizing Linear SVMs
Exercise 6: Visualizing support vectors using ggplot
Exercise 7: Visualizing decision & margin bounds using `ggplot2`
Exercise 8: Visualizing decision & margin bounds using `plot()`
Exercise 9: Tuning linear SVMs
Exercise 10: Tuning a linear SVM
Exercise 11: Visualizing decision boundaries and margins
Exercise 12: When are soft margin classifiers useful?
Exercise 13: Multiclass problems
Exercise 14: A multiclass classification problem
Exercise 15: Iris redux - a more robust accuracy.

Provides an introduction to polynomial kernels via a dataset that is radially separable (i.e. has a circular decision boundary). After demonstrating the inadequacy of linear kernels for this dataset, students will see how a simple transformation renders the problem linearly separable thus motivating an intuitive discussion of the kernel trick. Students will then apply the polynomial kernel to the dataset and tune the resulting classifier.

Exercise 1: Generating a radially separable dataset
Exercise 2: Generating a 2d radially separable dataset
Exercise 3: Visualizing the dataset
Exercise 4: Linear SVMs on radially separable data
Exercise 5: Linear SVM for a radially separable dataset
Exercise 6: Average accuracy for linear SVM
Exercise 7: The kernel trick
Exercise 8: Visualizing transformed radially separable data
Exercise 9: SVM with polynomial kernel
Exercise 10: Tuning SVMs
Exercise 11: Using `tune.svm()`
Exercise 12: Building and visualizing the tuned model

Builds on the previous three chapters by introducing the highly flexible Radial Basis Function (RBF) kernel. Students will create a "complex" dataset that shows up the limitations of polynomial kernels. Then, following an intuitive motivation for the RBF kernel, students see how it addresses the shortcomings of the other kernels discussed in this course.

Exercise 1: Generating a complex dataset
Exercise 2: Generating a complex dataset - part 1
Exercise 3: Generating a complex dataset - part 2
Exercise 4: Visualizing the dataset
Exercise 5: Motivating the RBF kernel
Exercise 6: Linear SVM for complex dataset
Exercise 7: Quadratic SVM for complex dataset
Exercise 8: The RBF Kernel
Exercise 9: Polynomial SVM on a complex dataset
Exercise 10: RBF SVM on a complex dataset
Exercise 11: Tuning an RBF kernel SVM