Excluding observations
Besides missing values, there might be other reasons to exclude observations. In our human data, there are a few data points which have been computed from other observations. We want to remove them before further analysis.
The basic way in R to reference the rows or columns of a data frame is to use brackets ([,]) along with indices or names. A comma is used to separate row and column references. In the examples below, df is a data frame.
df[,] # select every row and every column
df[1:5, ] # select first five rows
df[, c(2, 5)] # select 2nd and 5th columns
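To try these forms out, here is a minimal sketch on a small made-up data frame. The object df and all of its columns are hypothetical and are not part of the human data. Besides the three forms above, it shows selection by column name and by negative index, which excludes the listed rows.

```r
# toy data frame, made up for illustration only
df <- data.frame(
  id    = 1:6,
  name  = c("a", "b", "c", "d", "e", "f"),
  x     = c(2.5, 3.1, 4.7, 1.2, 5.0, 3.3),
  y     = c(10, 20, 30, 40, 50, 60),
  group = c("g1", "g1", "g2", "g2", "g3", "g3"),
  stringsAsFactors = FALSE
)

first_five <- df[1:5, ]                 # rows 1 to 5, all columns
two_cols   <- df[, c(2, 5)]             # all rows, 2nd and 5th columns
by_name    <- df[, c("name", "group")]  # the same two columns, selected by name
dropped    <- df[-c(5, 6), ]            # negative indices exclude rows 5 and 6

dim(first_five)  # 5 rows, 5 columns
dim(two_cols)    # 6 rows, 2 columns
```

Leaving the row or column position empty keeps everything along that dimension, which is why the comma is still needed when only one side is given.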
This exercise is part of the course Helsinki Open Data Science.
Exercise instructions
- Use tail() to print out the last 10 observations of human (hint: ?tail). What are the last 10 country names?
- Create object last
- Create data frame human_ by selecting rows from the 1st to last from human.
- Define the rownames in human_ by the Country column
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# human without NA is available
# look at the last 10 observations of human
# define the last index we want to keep
last <- nrow(human) - 7
# choose every row up to last, leaving out the final 7 observations
human_ <- human["change me!", ]
# add countries as rownames
rownames(human_) <- human_$Country
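
The same steps can be rehearsed on a small stand-in before working with human. The sketch below is only an illustration: the data frame toy, its country names, and its values are made up, and it drops the last 2 rows rather than 7.

```r
# toy stand-in for the human data; countries and values are made up
toy <- data.frame(
  Country = c("A-land", "B-land", "C-land", "Region 1", "Region 2"),
  value   = c(1.0, 2.0, 3.0, 1.5, 2.5),
  stringsAsFactors = FALSE
)

# look at the last 2 observations
tail(toy, n = 2)

# index of the last row we want to keep
last <- nrow(toy) - 2

# keep rows 1 through last, with all columns
toy_ <- toy[1:last, ]

# use the country names as row names
rownames(toy_) <- toy_$Country
```

Note that 1:last counts backwards when last is 0, so seq_len(last) is a safer choice if the number of rows to drop could equal the number of rows in the data.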