Visualizing many variables

As you begin to consider more variables, plotting them all at the same time becomes increasingly difficult. In addition to using x and y scales for two numeric variables, you can use color for a third numeric variable, and you can use faceting for categorical variables. That is about the limit before plots become too difficult to interpret. There are some specialist plot types, like correlation heatmaps and parallel coordinates plots, that handle more variables, but they give you much less information about each variable, and they aren't great for visualizing model predictions.

Here you'll push the limits of the scatter plot by showing the house price, the distance to the MRT station, the number of nearby convenience stores, and the house age, all together in one plot.

taiwan_real_estate is available; ggplot2 is loaded.

This exercise is part of the course

Intermediate Regression in R

Exercise instructions

  • Using the taiwan_real_estate dataset, draw a scatter plot of n_convenience versus the square root of dist_to_mrt_m, colored by price_twd_msq.
  • Use the continuous viridis plasma color scale.
  • Facet the plot, wrapping by house_age_years.

Interactive exercise

Try this exercise by completing the sample code.

# Using taiwan_real_estate, plot no. of conv. stores vs. sqrt of dist. to MRT, colored by house price
___ +
  # Make it a scatter plot
  ___ +
  # Use the continuous viridis plasma color scale
  ___ +
  # Facet, wrapped by house age
  ___
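
One possible way to fill in the blanks is sketched below. It is not the official solution, only an illustration of how the three instructions could map onto ggplot2 layers. Because taiwan_real_estate is only available inside the exercise environment, the sketch builds a small stand-in data frame called taiwan_demo (a hypothetical name) with the same column names and invented values; it also assumes that house_age_years is a categorical variable.

```r
library(ggplot2)

# Hypothetical stand-in for taiwan_real_estate: same column names,
# but every value here is invented purely for illustration.
set.seed(1)
n <- 60
taiwan_demo <- data.frame(
  dist_to_mrt_m   = runif(n, 50, 6000),
  n_convenience   = sample(0:10, n, replace = TRUE),
  house_age_years = factor(
    sample(c("0 to 15", "15 to 30", "30 to 45"), n, replace = TRUE)
  ),
  price_twd_msq   = runif(n, 3, 20)
)

# x: square root of distance to MRT, y: number of convenience stores,
# color: house price (the third numeric variable)
p <- ggplot(
  taiwan_demo,
  aes(sqrt(dist_to_mrt_m), n_convenience, color = price_twd_msq)
) +
  # Make it a scatter plot
  geom_point() +
  # Continuous viridis scale, using the "plasma" palette
  scale_color_viridis_c(option = "plasma") +
  # One panel per house age group
  facet_wrap(vars(house_age_years))

print(p)
```

In the exercise itself, taiwan_real_estate would take the place of taiwan_demo. The square root transformation is applied inside aes() so that the original column stays unchanged.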