Mother's little helper (2)

A more general way of matching columns is to check if their names contain a value anywhere within them (rather than starting or ending with a value). As you may be able to guess, you can do this using a helper named contains().

Even more generally, you can match columns using regular expressions. Regular expressions ("regexes" for short) are a powerful language used for matching text. If you want to learn how to use regular expressions, take the *String Manipulation with stringr in R* course. For now, you only need to know three things.

a: A letter means "match that letter".
.: A dot means "match any character, including letters, numbers, punctuation, etc.".
?: A question mark means "the previous character is optional".
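
To see how these three rules combine, here is a minimal sketch in base R. The column names in example_names are hypothetical, chosen only for illustration; they are not taken from track_metadata_tbl.

```r
# Hypothetical column names, used only to illustrate the rules above
example_names <- c("title", "artist_name", "duration", "year")

# Plain substring test: does the name contain "ti" anywhere?
has_ti <- grepl("ti", example_names, fixed = TRUE)

# Regex test: "t", "i", then an optional single character of any kind, then "t"
matches_regex <- grepl("ti.?t", example_names)

print(data.frame(name = example_names, has_ti, matches_regex))
```

In this sketch "title" matches the regex with nothing between "ti" and "t", "artist_name" matches with one character ("s") in between, and "duration" contains "ti" but is not followed by a "t" within one character, so it fails the regex.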

You can find columns that match a particular regex using the matches() select helper.
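
As a rough illustration of both helpers, the sketch below uses a small local tibble rather than a Spark table, so it can run without a cluster. The name toy_tbl and its columns are assumptions made up for this example; the same select() calls are expected to work on a Spark-backed tibble, but that is not shown here.

```r
library(dplyr)

# Hypothetical local data standing in for a Spark table
toy_tbl <- tibble(
  title = "a",
  artist_name = "b",
  duration = 1,
  year = 2000
)

# Keep columns whose names contain the literal text "ti"
contains_ti <- toy_tbl %>%
  select(contains("ti"))

# Keep columns whose names match the regular expression "ti.?t"
matches_ti <- toy_tbl %>%
  select(matches("ti.?t"))

print(names(contains_ti))
print(names(matches_ti))
```

Note that contains() treats its argument as literal text, while matches() interprets it as a regex, which is why the two calls can return different sets of columns for similar-looking inputs.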

A Spark connection has been created for you as spark_conn. A tibble attached to the track metadata stored in Spark has been pre-defined as track_metadata_tbl.

Select all columns from track_metadata_tbl containing "ti".
Select all columns from track_metadata_tbl matching the regular expression "ti.?t".
