Visualizing a sugar content dataset
In this exercise, you will create a 1-dimensional scatter plot of 25 soft drink sugar content measurements. The aim is to visualize distinct clusters in the dataset as a first step towards identifying candidate decision boundaries.
The dataset with 25 sugar content measurements is stored in the sugar_content column of the data frame df, which has been preloaded for you.
This exercise is part of the course
Support Vector Machines in R
Exercise instructions
- Load the ggplot2 package.
- List the variables in the data frame df.
- Complete the scatter plot code. Using the df dataset, plot the sugar content of samples along the x-axis (at y equal to zero).
- Write ggplot() code to display sugar content in df as a scatter plot. Can you spot two distinct clusters corresponding to high and low sugar content samples?
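One rough way to back up the visual impression of two clusters is to sort the measurements and look for the widest gap between neighboring values; the midpoint of that gap is one candidate decision boundary. The sketch below uses only base R and assumes a stand-in vector of simulated values (hypothetical, not the course data), since the preloaded df is only available inside the exercise environment.

```r
# Hypothetical stand-in values; in the exercise you would use df$sugar_content.
set.seed(42)
sugar <- c(rnorm(12, mean = 8, sd = 0.5), rnorm(13, mean = 11, sd = 0.5))

sorted <- sort(sugar)
gaps <- diff(sorted)   # distance between neighboring sorted values
i <- which.max(gaps)   # position of the widest gap

# Midpoint of the widest gap: one candidate boundary between the clusters
candidate_boundary <- (sorted[i] + sorted[i + 1]) / 2
candidate_boundary
```

This is only a heuristic: it presumes the data really do split into two groups, and any point inside the widest gap would separate them equally well.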
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load ggplot2
___
# Print variable names
___
# Plot sugar content along the x-axis
plot_df <- ggplot(data = ___, aes(x = ___, y = ___)) +
  geom_point() +
  geom_text(aes(label = sugar_content), size = 2.5, vjust = 2, hjust = 0.5)
# Display plot
plot_df
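
For reference, a completed version of the template might look like the sketch below. Because the preloaded df is not available outside the exercise environment, the sketch builds a stand-in data frame from simulated values (hypothetical, not the course data). names(df) is one way to list the variables; colnames(df) would work equally well.

```r
# Load ggplot2
library(ggplot2)

# Stand-in for the preloaded data frame: 25 simulated values in two groups.
# These numbers are hypothetical and are NOT the course dataset.
set.seed(42)
df <- data.frame(
  sugar_content = round(c(rnorm(12, mean = 8, sd = 0.5),
                          rnorm(13, mean = 11, sd = 0.5)), 1)
)

# Print variable names
names(df)

# Plot sugar content along the x-axis, fixing y at zero
plot_df <- ggplot(data = df, aes(x = sugar_content, y = 0)) +
  geom_point() +
  geom_text(aes(label = sugar_content), size = 2.5, vjust = 2, hjust = 0.5)

# Display plot
plot_df
```

Setting y to the constant 0 inside aes() places every point on a single horizontal line, so the only variation visible in the plot is along the sugar content axis.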