Excluding observations
Besides missing values, there might be other reasons to exclude observations. In our human data, there are a few data points which have been computed from other observations. We want to remove them before further analysis.
The basic way in R to reference the rows or columns of a data frame is to use brackets ([,]) along with indices or names. A comma is used to separate row and column references. In the examples below, df is a data frame.
df[,] # select every row and every column
df[1:5, ] # select first five rows
df[, c(2, 5)] # select 2nd and 5th columns
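As a minimal sketch of the bracket notation above, the following builds a small data frame and indexes it in several ways. The data frame df and its columns id, name and score are made up for illustration and are not part of the course data.

```r
# Toy data frame (hypothetical, for illustration only)
df <- data.frame(
  id    = 1:6,
  name  = c("a", "b", "c", "d", "e", "f"),
  score = c(10, 20, 30, 40, 50, 60),
  stringsAsFactors = FALSE
)

first_three <- df[1:3, ]                 # rows 1 to 3, all columns
two_cols    <- df[, c(1, 3)]             # all rows, 1st and 3rd columns
by_name     <- df[, c("name", "score")]  # columns selected by name
one_col     <- df[, 2]                   # a single column is returned as a vector
kept_df     <- df[, 2, drop = FALSE]     # drop = FALSE keeps it a data frame
```

Leaving the row or column position empty selects everything along that dimension, which is why df[1:3, ] returns all columns.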
This exercise is part of the course
Helsinki Open Data Science
Exercise instructions
- Use tail() to print out the last 10 observations of human (hint: ?tail). What are the last 10 country names?
- Create object last
- Create data frame human_ by selecting rows from the 1st to last from human.
- Define the rownames in human_ by the Country column
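The same steps can be tried on a small stand-in before working with human. The sketch below assumes a hypothetical data frame toy with five country rows followed by two rows computed from the others; its names and values are invented for illustration only.

```r
# Hypothetical stand-in for human: 5 countries followed by 2 computed rows
toy <- data.frame(
  Country = c("Norway", "Finland", "Japan", "Chile", "Kenya", "Europe", "World"),
  value   = c(1.0, 2.0, 3.0, 4.0, 5.0, 1.5, 3.0),
  stringsAsFactors = FALSE
)

tail(toy, n = 2)                 # inspect the last 2 rows

last <- nrow(toy) - 2            # index of the last row to keep
toy_ <- toy[1:last, ]            # rows 1 to last, all columns

rownames(toy_) <- toy_$Country   # label the rows by country name
toy_["Japan", "value"]           # rows can now be referenced by name
```

Once the row names are set, a row can be selected by its country name instead of its position.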
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# human without NA is available
# look at the last 10 observations of human
# define the last index we want to keep
last <- nrow(human) - 7
# choose everything until the last 7 observations
human_ <- human["change me!", ]
# add countries as rownames
rownames(human_) <- human_$Country