Exercise

Accessing data

Since the focus of this course is statistical modeling, we'll assume you already know how to get data into R. Many of the datasets will come from a package written specifically for this course: statisticalModeling. This package is already installed on the DataCamp servers. To use it on your own computer, you'll have to install it there.

To access data contained in an R package, you have a few options:

  1. Use the data() function: data("CPS85", package = "mosaicData")
  2. Refer to the package using double-colon notation: mosaicData::CPS85
  3. Load the package, then refer to the dataset by name: library(mosaicData); CPS85
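
As a rough sketch, the three options can be tried side by side. The example below is an assumption for illustration: it uses R's built-in datasets package (and its mtcars, iris, and airquality data frames) as a stand-in for mosaicData, so that it runs without installing anything. A different dataset is used for each option so that the copy made by data() does not mask the later lookups.

```r
# Stand-in example: the built-in datasets package takes the place of mosaicData.

# 1. data() copies the named dataset into the current workspace.
data("mtcars", package = "datasets")
rows_via_data <- nrow(mtcars)

# 2. Double-colon notation reads the dataset straight from the package,
#    without attaching the package or copying anything.
rows_via_colon <- nrow(datasets::iris)

# 3. library() attaches the package, so its datasets are visible by name.
library(datasets)
rows_via_library <- nrow(airquality)

print(c(rows_via_data, rows_via_colon, rows_via_library))
```

Double-colon notation is handy for a one-off lookup, while library() is more convenient when you will use several objects from the same package.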

Let's get some quick practice with these three approaches before moving on.

Instructions
  • Use the data() function to load the Trucking_jobs data frame from statisticalModeling. Both the name of the dataset and the package it's coming from should be surrounded with double quotes (see example above).
  • Use nrow() to find the number of rows in Trucking_jobs.
  • Use names() to find the names of variables in the mosaicData::Riders data (using double-colon notation).
  • Load the ggplot2 package.
  • Look at the head() of the diamonds dataset, which is contained in ggplot2. You can refer to the dataset directly by name, since the package is loaded.
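
If the inspection functions named above are unfamiliar, here is a minimal sketch of what each one returns. It deliberately uses iris from R's built-in datasets package as a stand-in (an assumption for illustration, not one of the course datasets), so the exercise itself is left for you to complete.

```r
# Stand-in data: iris from the built-in datasets package, not a course dataset.
n_rows     <- nrow(datasets::iris)    # number of rows (cases)
var_names  <- names(datasets::iris)   # names of the variables (columns)
first_rows <- head(datasets::iris)    # first six rows by default

print(n_rows)
print(var_names)
print(first_rows)
```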