Using .SD (I)
.SD together with .SDcols is an incredibly powerful feature that makes computing on multiple columns so much easier. .SD is a special symbol which stands for Subset of Data. .SDcols holds the columns that should be included in .SD.
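As a rough illustration of how .SD and .SDcols work together, the sketch below computes per-group means over two chosen columns. The trips table and its column names are made up for this example and are not the course's batrips data.

```r
library(data.table)

# Hypothetical toy data (an assumption for illustration, not batrips)
trips <- data.table(
  city     = c("A", "A", "B", "B"),
  duration = c(10, 20, 30, 50),
  distance = c(1, 3, 2, 6),
  rider    = c("x", "y", "x", "z")
)

# For each city, .SD is the subset of rows for that group,
# restricted to the columns listed in .SDcols
avg <- trips[, lapply(.SD, mean),
             by = city,
             .SDcols = c("duration", "distance")]
avg
```

Because rider is not listed in .SDcols, it is left out of .SD, so mean() is never applied to a character column.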
This exercise is part of the course Data Manipulation with data.table in R.
Exercise instructions
- For each month, find the row corresponding to the shortest trip (by using which.min() on duration).
- The result should contain the month, start_station, end_station, start_date, end_date, and duration columns.
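The general pattern of picking one row per group with which.min() can be sketched on unrelated data. The scores table below is hypothetical and exists only to show the shape of the call. which.min() returns the position of the first minimum, and that position is used to index the rows of .SD.

```r
library(data.table)

# Hypothetical toy data (an assumption for illustration, not batrips)
scores <- data.table(
  team   = c("red", "red", "blue", "blue", "blue"),
  player = c("ann", "bob", "cat", "dan", "eve"),
  time   = c(12.5, 11.0, 14.2, 13.1, 13.9)
)

# which.min() gives the index of the smallest value in a vector
pos <- which.min(c(5, 2, 8))

# Within each team, keep the single row of .SD with the smallest time
fastest <- scores[, .SD[which.min(time)],
                  by = team,
                  .SDcols = c("player", "time")]
fastest
```

The grouping column appears first in the result, followed by the columns named in .SDcols.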
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
relevant_cols <- c("start_station", "end_station",
                   "start_date", "end_date", "duration")
# Find the row corresponding to the shortest trip per month
shortest <- batrips[, ___,
                    by = month(start_date),
                    .SDcols = ___]
shortest