More datasets
Welcome to the Logistic regression chapter.
During the next exercises, we will be combining, wrangling and analysing two new data sets retrieved from the UCI Machine Learning Repository, a great source for open data.
The data are from two identical questionnaires related to secondary school student alcohol consumption in Portugal. Read about the data and the variables here.
R offers the convenient paste() function, which makes it easy to combine character strings. Let's use it to get our hands on the data!
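To see what paste() does before using it on the real addresses, here is a minimal sketch. The base address and file names below are hypothetical placeholders chosen for illustration; they are not part of the course data.

```r
# Hypothetical base address and file names, used only for illustration
base <- "http://example.com/data"

# sep is the string placed between the pieces
with_slash <- paste(base, "file-a.csv", sep = "/")

# without sep, paste() falls back to its default separator, a single space
with_space <- paste(base, "file-a.csv")

with_slash
with_space

# paste() is vectorised: one base is combined with each file name in turn
both <- paste(base, c("file-a.csv", "file-b.csv"), sep = "/")
both
```

The default separator is why a web address built without sep = "/" would contain a space and would likely not be a valid address.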
This exercise is part of the course Helsinki Open Data Science.
Exercise instructions
- Create and print out the object url_math.
- Create object math by reading the math class questionnaire data from the web address defined in url_math.
- Create and print out url_por.
- Adjust the code: similarly to url_math, make url_por into a valid web address using paste() and the url object.
- Create object por by reading the Portuguese class questionnaire data from the web address defined in url_por.
- Print out the names of the columns in both data sets.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
url <- "http://s3.amazonaws.com/assets.datacamp.com/production/course_2218/datasets"
# web address for math class data
url_math <- paste(url, "student-mat.csv", sep = "/")
# print out the address
url_math
# read the math class questionnaire data into memory
math <- read.table(url_math, sep = ";", header = TRUE)
# web address for portuguese class data
url_por <- paste("replace me!", "student-por.csv", sep =" change me! ")
# print out the address
# read the Portuguese class questionnaire data into memory
por <- read.table(url_por, sep = ";", header = TRUE)
# look at the column names of both data sets
colnames(math)
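The sep = ";" and header = TRUE arguments used in the read.table() calls above can be tried out without a network connection. The sketch below reads a small made-up sample through the text argument of read.table(); the values and the three column names are assumptions chosen for illustration, not the actual questionnaire data.

```r
# Made-up sample in a semicolon-separated layout; not the real data
sample_text <- "school;sex;age\nGP;F;18\nGP;M;17\nMS;F;16"

# header = TRUE takes the column names from the first line;
# sep = ";" splits each line into fields at the semicolons
sample_df <- read.table(text = sample_text, sep = ";", header = TRUE)

colnames(sample_df)
str(sample_df)

# For comparison: the default separator is whitespace, so without
# sep = ";" every line is read as one single field
one_col <- read.table(text = sample_text, header = TRUE)
ncol(one_col)
```

If a data set read from a file unexpectedly ends up with a single column, a mismatched sep value is one of the first things worth checking.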