Mother's little helper (2)
A more general way of matching columns is to check if their names contain a value anywhere within them (rather than starting or ending with a value). As you may be able to guess, you can do this using a helper named contains().
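As a minimal sketch of how contains() behaves, the example below uses a small local tibble instead of a Spark table. The tibble `tracks` and its column names are hypothetical stand-ins chosen for illustration; they are not taken from the course data.

```r
library(dplyr)

# Hypothetical local stand-in for a table of track metadata
tracks <- tibble(
  title = "Example Song",
  artist_name = "Example Artist",
  duration = 215.3,
  year = 1999L
)

# contains() keeps every column whose name includes the literal text "ti"
ti_columns <- tracks %>%
  select(contains("ti")) %>%
  colnames()

print(ti_columns)
# "title" "artist_name" "duration"
```

Note that "ti" can sit anywhere in the name: at the start of title, in the middle of artist_name, and near the end of duration.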
Even more generally, you can match columns using regular expressions. Regular expressions ("regexes" for short) are a powerful language used for matching text. If you want to learn how to use regular expressions, take the *String Manipulation with stringr in R* course. For now, you only need to know three things.
- a: A letter means "match that letter".
- .: A dot means "match any character, including letters, numbers, punctuation, etc.".
- ?: A question mark means "the previous character is optional".
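The three rules above can be tried out on plain strings before applying them to column names. This sketch uses base R's grepl(); the vector `candidates` is made up for illustration, and the pattern "ti.?t" reads as "t, then i, then at most one character of any kind, then t".

```r
# Made-up strings to test against the pattern
candidates <- c("title", "tit", "tiat", "tt", "tixyt")

# "ti.?t": the letters t and i, an optional single character, then t
pattern <- "ti.?t"

is_match <- grepl(pattern, candidates)
print(setNames(is_match, candidates))
# title   tit  tiat    tt tixyt
#  TRUE  TRUE  TRUE FALSE FALSE
```

"tixyt" fails because two characters sit between "ti" and the final "t", while the pattern allows at most one.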
You can find columns that match a particular regex using the matches() select helper.
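Here is a rough sketch of matches() on a local tibble rather than a Spark table. As before, `tracks` and its column names are hypothetical and only serve to show which names the regex picks up.

```r
library(dplyr)

# Hypothetical local stand-in for a table of track metadata
tracks <- tibble(
  title = "Example Song",
  artist_name = "Example Artist",
  duration = 215.3,
  year = 1999L
)

# matches() keeps every column whose name matches the regular expression
regex_columns <- tracks %>%
  select(matches("ti.?t")) %>%
  colnames()

print(regex_columns)
# "title" "artist_name"
```

Here title matches through "tit" and artist_name through "tist". The duration column contains "ti" but has no "t" within one character after it, so it is left out.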
This exercise is part of the course
Introduction to Spark with sparklyr in R
Exercise instructions
A Spark connection has been created for you as spark_conn. A tibble attached to the track metadata stored in Spark has been pre-defined as track_metadata_tbl.
- Select all columns from track_metadata_tbl containing "ti".
- Select all columns from track_metadata_tbl matching the regular expression "ti.?t".
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# track_metadata_tbl has been pre-defined
track_metadata_tbl

track_metadata_tbl %>%
  # Select columns containing ti
  ___

track_metadata_tbl %>%
  # Select columns matching ti.?t
  ___