Exercise

Introducing the American Community Survey

In this first chapter, you will analyze data from the 2013 American Community Survey (ACS) to find out whether it makes sense to pursue a PhD. The end result will be a Kaggle script that you can share from your own Kaggle account.

After binding together the ss13pusa.csv and ss13pusb.csv files, which you can find here, we created a subset containing 300,000 observations and 3 variables: SCHL (School Level), PINCP (Income) and ESR (Work Status). This subset is called ac_survey.

Note: A basic understanding of R syntax is required for this course. In addition, you will need to use some basic functions from the dplyr and ggplot2 packages.

Instructions

  • acs_url represents the URL of the .RData file that contains the ac_survey data frame. Use acs_url in combination with load() and url() to import it into R.
  • Investigate the first 20 observations of ac_survey using head().
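
The two steps above can be sketched as follows. This is only an illustration under stated assumptions: in the exercise, acs_url is already defined for you, so the sketch first builds a tiny stand-in ac_survey with made-up values, saves it to a temporary .RData file, and points acs_url at that local file through a file:// URL. The values and the temporary file are hypothetical; only the load(url(...)) and head() calls correspond to the instructions.

```r
# Stand-in data: a tiny ac_survey with made-up values (NOT the real ACS subset).
ac_survey <- data.frame(
  SCHL  = c(16, 21, 24),
  PINCP = c(20000, 45000, 80000),
  ESR   = c(1, 1, 6)
)

# Save the stand-in to a temporary .RData file, then remove it from the
# workspace so that the import below really has to restore it.
tmp <- tempfile(fileext = ".RData")
save(ac_survey, file = tmp)
rm(ac_survey)

# Assumption: in the exercise, acs_url is predefined and points to the course
# file. Here it is a file:// URL for the temporary file created above.
path <- normalizePath(tmp, winslash = "/")
acs_url <- paste0("file://", if (startsWith(path, "/")) "" else "/", path)

# Step 1: url() creates a connection to the file and load() restores the
# objects stored in it. load() invisibly returns the restored object names.
loaded <- load(url(acs_url))
print(loaded)

# Step 2: inspect the first 20 observations (the stand-in has only 3 rows,
# so all of them are shown).
head(ac_survey, 20)
```

Because load() restores objects under the names they were saved with, there is no need to assign its result to ac_survey; the returned character vector is only useful for checking which objects were restored.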