Mother's little helper (2)

A more general way of matching columns is to check if their names contain a value anywhere within them (rather than starting or ending with a value). As you may be able to guess, you can do this using a helper named contains().

Even more generally, you can match columns using regular expressions. Regular expressions ("regexes" for short) are a powerful language used for matching text. If you want to learn how to use regular expressions, take the *String Manipulation with stringr in R* course. For now, you only need to know three things.

  1. a: A letter means "match that letter".
  2. .: A dot means "match any character, including letters, numbers, punctuation, etc.".
  3. ?: A question mark means "the previous character is optional".
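The three rules above can be tried on plain strings before applying them to columns. The following is a minimal sketch using base R's grepl(); the strings in candidate_names are hypothetical and are not taken from the course data.

```r
# Hypothetical strings to test against; not the real column names
candidate_names <- c("title", "artist_id", "duration", "year")

# "ti.?t" reads as: "t", then "i", then at most one character of any kind, then "t"
is_match <- grepl("ti.?t", candidate_names)

# "title" matches via "tit" and "artist_id" via "tist";
# "duration" contains "ti" but no "t" within one character after it
is_match
# → TRUE, TRUE, FALSE, FALSE
```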

You can find columns that match a particular regex using the matches() select helper.
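As an illustration of the two helpers, here is a small sketch run on a local tibble rather than on a Spark table. It assumes the dplyr package is installed; the tracks tibble and its column names are made up for this example.

```r
library(dplyr)

# Hypothetical local data; the column names are illustrative only
tracks <- tibble(
  title = "Example song",
  artist_id = "A1",
  duration = 215.3,
  year = 2001L
)

# contains() keeps columns with the literal text "ti" anywhere in the name
with_ti <- tracks %>% select(contains("ti"))
names(with_ti)
# → "title", "artist_id", "duration"

# matches() keeps columns whose name matches the regular expression
ti_then_t <- tracks %>% select(matches("ti.?t"))
names(ti_then_t)
# → "title", "artist_id"
```

Note that contains() treats its argument as literal text, while matches() interprets it as a regular expression, so the two calls can return different sets of columns even for similar-looking arguments.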

This exercise is part of the course

Introduction to Spark with sparklyr in R

Exercise instructions

A Spark connection has been created for you as spark_conn. A tibble attached to the track metadata stored in Spark has been pre-defined as track_metadata_tbl.

  • Select all columns from track_metadata_tbl containing "ti".
  • Select all columns from track_metadata_tbl matching the regular expression "ti.?t".

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# track_metadata_tbl has been pre-defined
track_metadata_tbl

track_metadata_tbl %>%
  # Select columns containing ti
  ___

track_metadata_tbl %>%
  # Select columns matching ti.?t
  ___