NYC SAT Scores Data Viz
In the last lesson, when discussing Latin Squares, we did numerical EDA in the form of looking at means, variances, and medians of the math SAT scores. Another crucial part of the EDA is data visualization, as it often helps in spotting outliers plus gives you a visual representation of the distribution of your variables.
ggplot2
has been loaded for you and the nyc_scores
dataset is available. Create and examine the requested boxplot. How do the medians differ by Borough? How many outliers are present, and where are they mostly present?
This exercise is part of the course
Experimental Design in R
Exercise instructions
- Create a boxplot of Math SAT scores by
Borough
. - Run the code to include a title:
"Average SAT Math Scores by Borough, NYC"
. - Change the x- and y-axis labels to read
"Borough (NYC)"
and"Average SAT Math Scores (2014-15)"
, respectively, using the correct arguments tolabs()
.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create a boxplot of Math scores by Borough, with a title and x/y axis labels
ggplot(___) +
___ +
labs(title = "Average SAT Math Scores by Borough, NYC",
___,
___)