Exercise

Introducing the American Community Survey

In this first chapter you will analyze data from the 2013 American Community Survey (ACS) to find out whether it makes sense to pursue a PhD. The end result will be a Kaggle script that you can share from your own Kaggle account.

After binding together the ss13pusa.csv and ss13pusb.csv files that you can find here, we created a subset containing 300,000 observations and three variables: SCHL (School Level), PINCP (Income) and ESR (Work Status). This subset is called ac_survey.

Note: A basic understanding of R syntax is required for this course. In addition, you will need to use some basic functions from the dplyr and ggplot2 packages.

Instructions
100 XP
  • acs_url represents the URL of the .RData file that contains the ac_survey data frame. Use acs_url in combination with load() and url() to import it into R.
  • Investigate the first 20 observations of ac_survey using head().
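
The two steps above can be sketched as follows. This is a minimal illustration, not the course's reference solution. The real value of acs_url is supplied by the exercise environment and is not reproduced here, so the sketch builds a small made-up stand-in for ac_survey, saves it to a temporary .RData file, and points acs_url at that file through a file:// URL. The values in the toy data frame are hypothetical; only the column names come from the text above.

```r
# Stand-in for the real data: 25 made-up rows with the three columns named
# in the exercise. The values are hypothetical, for illustration only.
set.seed(1)
ac_survey <- data.frame(
  SCHL  = sample(16:24, 25, replace = TRUE),
  PINCP = sample(seq(0, 120000, by = 5000), 25, replace = TRUE),
  ESR   = sample(1:6, 25, replace = TRUE)
)

# Save the stand-in to a temporary .RData file (uncompressed, so that it can
# be read through a plain file:// connection), then remove it from the
# workspace to show that load() brings it back.
tmp <- tempfile(fileext = ".RData")
save(ac_survey, file = tmp, compress = FALSE)
rm(ac_survey)

# Assumption: in the exercise, acs_url is already defined and points to a
# remote .RData file. Here it is a local file:// URL built from the temp file.
acs_url <- paste0("file:///", sub("^/", "", normalizePath(tmp, winslash = "/")))

# Step 1: import the .RData file. load() restores objects under the names
# they were saved with, so ac_survey reappears in the workspace.
load(url(acs_url))

# Step 2: look at the first 20 observations.
first_rows <- head(ac_survey, 20)
print(first_rows)
```

Note that load() does not return the data frame itself. It restores the saved objects into the workspace and returns their names invisibly, which is why ac_survey can be used directly after the call.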