Mother's little helper (1)
If your dataset has thousands of columns and you want to select many of them, typing the name of each column in a call to select() quickly becomes tedious. Fortunately, select() has helper functions that let you select multiple columns without typing much code.
These helpers include starts_with() and ends_with(), which match columns whose names start or end with a given prefix or suffix, respectively. Due to dplyr's special code evaluation techniques, these functions can only be called from inside a call to select(); they don't make sense on their own.
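To make the behavior of the two helpers concrete, here is a minimal sketch on a small local tibble rather than a Spark table. The data frame toy and its column names are hypothetical and chosen only for illustration. The sketch assumes dplyr is installed.

```r
library(dplyr)

# Hypothetical local data frame; the column names are made up for illustration.
toy <- tibble(
  artist_name = "Some Artist",
  artist_id   = "AR001",
  track_id    = "TR001",
  year        = 1999L
)

# Inside select(), the helpers are matched against the column names of `toy`.
starts_artist <- toy %>% select(starts_with("artist"))
ends_id       <- toy %>% select(ends_with("id"))

colnames(starts_artist)  # "artist_name" "artist_id"
colnames(ends_id)        # "artist_id"   "track_id"

# Outside a selecting function there are no column names to match against,
# so the call raises an error; tryCatch() captures it instead of stopping.
outside <- tryCatch(starts_with("artist"), error = function(e) "error")
```

The last line illustrates why the helpers only make sense inside select(): on their own they have no set of columns to inspect.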
This exercise is part of the course Introduction to Spark with sparklyr in R.
Exercise instructions
A Spark connection has been created for you as spark_conn. A tibble attached to the track metadata stored in Spark has been pre-defined as track_metadata_tbl.
- Select all columns from track_metadata_tbl starting with "artist".
- Select all columns from track_metadata_tbl ending with "id".
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# track_metadata_tbl has been pre-defined
track_metadata_tbl

track_metadata_tbl %>%
  # Select columns starting with artist
  ___

track_metadata_tbl %>%
  # Select columns ending with id
  ___
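One possible way to fill in the blanks in the sample code is sketched below. It is not an official solution. In the exercise environment track_metadata_tbl already points at the Spark table. The fallback tibble is a hypothetical local stand-in, with assumed column names, so that the sketch also runs without a Spark connection. The names artist_cols and id_cols are likewise introduced here only for illustration.

```r
library(dplyr)

# In the exercise, track_metadata_tbl is pre-defined and backed by Spark.
# This fallback is a hypothetical local stand-in (assumed column names)
# used only when no such object exists in the session.
if (!exists("track_metadata_tbl")) {
  track_metadata_tbl <- tibble(
    artist_name = c("Artist A", "Artist B"),
    artist_id   = c("AR001", "AR002"),
    track_id    = c("TR001", "TR002"),
    title       = c("Song A", "Song B")
  )
}

artist_cols <- track_metadata_tbl %>%
  # Select columns starting with artist
  select(starts_with("artist"))

id_cols <- track_metadata_tbl %>%
  # Select columns ending with id
  select(ends_with("id"))

# Inspect which columns each selection kept.
colnames(artist_cols)
colnames(id_cols)
```

The same select() calls apply whether the tibble is local or attached to Spark, which is why the stand-in can substitute for the real table in this sketch.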