
NYC SAT Scores Data Viz

In the last lesson, when discussing Latin squares, we did numerical EDA by looking at the means, variances, and medians of the math SAT scores. Another crucial part of EDA is data visualization, which often helps in spotting outliers and gives you a visual representation of the distribution of your variables.

ggplot2 has been loaded for you and the nyc_scores dataset is available. Create and examine the requested boxplot. How do the medians differ by Borough? How many outliers are present, and where are they mostly present?
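
The two questions above can also be checked numerically alongside the plot. The following is a minimal sketch, not the course's solution: it builds a small synthetic stand-in for nyc_scores so that it runs on its own, and the column names Borough and Average_Score_SAT_Math are assumptions about the real dataset. Outliers are counted with base R's boxplot.stats(), which uses the same 1.5 * IQR rule as a boxplot, although its hinges can differ slightly from the quartiles ggplot2 draws.

```r
# Synthetic stand-in data: these values are NOT the real nyc_scores dataset.
# The column names Borough and Average_Score_SAT_Math are assumptions.
set.seed(1)
nyc_scores <- data.frame(
  Borough = rep(c("Bronx", "Brooklyn", "Manhattan"), each = 20),
  Average_Score_SAT_Math = c(rnorm(20, 400, 30),
                             rnorm(20, 420, 40),
                             rnorm(20, 450, 60))
)

# Median math score per borough
medians <- tapply(nyc_scores$Average_Score_SAT_Math,
                  nyc_scores$Borough,
                  median, na.rm = TRUE)

# Number of points lying more than 1.5 * IQR beyond the hinges, per borough
outlier_counts <- tapply(nyc_scores$Average_Score_SAT_Math,
                         nyc_scores$Borough,
                         function(x) length(boxplot.stats(x)$out))

print(medians)
print(outlier_counts)
```

With the real dataset already loaded, only the two tapply() calls would be needed.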

This exercise is part of the course

Experimental Design in R


Exercise instructions

  • Create a boxplot of Math SAT scores by Borough.
  • Run the code to include a title: "Average SAT Math Scores by Borough, NYC".
  • Change the x- and y-axis labels to read "Borough (NYC)" and "Average SAT Math Scores (2014-15)", respectively, using the correct arguments to labs().

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create a boxplot of Math scores by Borough, with a title and x/y axis labels
ggplot(___) +
  ___ + 
  labs(title = "Average SAT Math Scores by Borough, NYC",
       ___,
       ___)
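
One way the blanks might be filled in is sketched below. This is an illustration under stated assumptions rather than the official solution: the column names Borough and Average_Score_SAT_Math are assumed, and a small synthetic data frame stands in for nyc_scores so that the snippet runs on its own.

```r
library(ggplot2)

# Synthetic stand-in data: these values are NOT the real nyc_scores dataset.
# The column names Borough and Average_Score_SAT_Math are assumptions.
set.seed(1)
nyc_scores <- data.frame(
  Borough = rep(c("Bronx", "Brooklyn", "Manhattan"), each = 20),
  Average_Score_SAT_Math = c(rnorm(20, 400, 30),
                             rnorm(20, 420, 40),
                             rnorm(20, 450, 60))
)

# Boxplot of math scores by borough, with a title and x/y axis labels
p <- ggplot(nyc_scores, aes(x = Borough, y = Average_Score_SAT_Math)) +
  geom_boxplot() +
  labs(title = "Average SAT Math Scores by Borough, NYC",
       x = "Borough (NYC)",
       y = "Average SAT Math Scores (2014-15)")

print(p)
```

The x and y arguments to labs() set the axis labels. The line inside each box marks the borough's median, and points drawn beyond the whiskers are the outliers.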