NYC SAT Scores Factorial EDA
Let's do some more EDA before we dive into the analysis of our factorial experiment.
Let's test the effect of Percent_Black_HL
, Percent_Tested_HL
, and Tutoring_Program
on the outcome, Average_Score_SAT_Math
. The HL
stands for high-low, where a 1 indicates respectively that less than 50% of Black students or that less than 50% of all students in an entire school were tested, and a 2 indicates that greater than 50% of either were tested.
Build a boxplot of each factor vs. the outcome to have an idea of which have a difference in median by factor level (ultimately, mean difference is what's tested.) The nyc_scores
dataset has been loaded for you.
This exercise is part of the course
Experimental Design in R
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load ggplot2
___
# Build the boxplot for the tutoring program vs. Math SAT score
ggplot(___,
aes(___, ___)) +
geom_boxplot()