
Box plots by groups

Box plots are an excellent way of displaying and comparing distributions. A box plot visualizes the 25th, 50th and 75th percentiles (the box), the typical range (the whiskers) and the outliers of a variable.

The whiskers extending from the box can be computed in several ways. The default (in base R and ggplot2) is to extend each whisker to the most extreme data point that is no more than 1.5*IQR away from the box, where IQR is the interquartile range, defined as

IQR = 75th percentile - 25th percentile

Values outside the whiskers can be considered outliers, that is, unusually distant observations. For more information on IQR, see Wikipedia.
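
As a minimal sketch of the rule above, the whisker ends and outliers can be computed by hand in base R. The vector `x` is made-up illustration data, not part of the course data set, and the variable names are hypothetical. The quartiles here come from `quantile()` with its default settings; base R's `boxplot()` uses hinges, which can differ slightly on some samples.

```r
# Made-up data with one unusually large value (an assumption for illustration)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

# 25th and 75th percentiles (the edges of the box)
q1 <- unname(quantile(x, 0.25))
q3 <- unname(quantile(x, 0.75))

# IQR = 75th percentile - 25th percentile; matches stats::IQR(x)
iqr_x <- q3 - q1

# Limits that lie 1.5 * IQR away from the box
lower_fence <- q1 - 1.5 * iqr_x
upper_fence <- q3 + 1.5 * iqr_x

# Each whisker ends at the most extreme data point inside the limits
lower_whisker <- min(x[x >= lower_fence])
upper_whisker <- max(x[x <= upper_fence])

# Points beyond the limits are drawn individually as outliers
outliers <- x[x < lower_fence | x > upper_fence]

print(c(IQR = iqr_x, lower = lower_whisker, upper = upper_whisker))
print(outliers)
```

For this vector the box spans 3.25 to 7.75, so the IQR is 4.5, the whiskers end at 1 and 9, and the value 30 is flagged as an outlier.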

This exercise is part of the course

Helsinki Open Data Science


Exercise instructions

  • Initialize a plot of student grades (G3), with high_use grouping the grade distributions on the x-axis. Draw the plot as a box plot.
  • Add an aesthetic element to the plot by defining col = sex inside aes().
  • Define a similar (box) plot of the variable absences, grouped by high_use on the x-axis and with the aesthetic col = sex.
  • Add a main title to the last plot with ggtitle("title here"). Use "Student absences by alcohol consumption and sex" as a title.
  • Does high use of alcohol have a connection to school absences?

Hands-on interactive exercise

Try this exercise by completing the sample code.

library(ggplot2)

# initialize a plot of high_use and G3
g1 <- ggplot(alc, aes(x = high_use, y = G3))

# define the plot as a boxplot and draw it
g1 + geom_boxplot() + ylab("grade")

# initialise a plot of high_use and absences


# define the plot as a boxplot and draw it

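
One possible way to complete the steps in the instructions is sketched below. The course's `alc` data frame is not available outside the exercise, so this sketch builds a small random stand-in with the same column names (`high_use`, `sex`, `G3`, `absences`). The values are made up and say nothing about the real data. The object names `g2`, `p1` and `p2` are also assumptions.

```r
library(ggplot2)

# Hypothetical stand-in for the course's `alc` data (random, illustration only)
set.seed(1)
alc <- data.frame(
  high_use = rep(c(FALSE, TRUE), each = 20),
  sex      = rep(c("F", "M"), times = 20),
  G3       = sample(0:20, 40, replace = TRUE),
  absences = rpois(40, lambda = 5)
)

# initialize a plot of high_use and G3, colored by sex
g1 <- ggplot(alc, aes(x = high_use, y = G3, col = sex))

# define the plot as a boxplot and draw it
p1 <- g1 + geom_boxplot() + ylab("grade")
print(p1)

# initialize a plot of high_use and absences, colored by sex
g2 <- ggplot(alc, aes(x = high_use, y = absences, col = sex))

# define the plot as a boxplot, add a main title and draw it
p2 <- g2 + geom_boxplot() +
  ggtitle("Student absences by alcohol consumption and sex")
print(p2)
```

Mapping `col = sex` inside `aes()` makes ggplot2 draw one box per sex within each `high_use` group, so the distributions can be compared side by side.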