1. Type-Specific Expectations
Let's take a look at some more column-level Expectations, now focusing on type-specific ones. First, we'll focus on numeric type columns.
2. Numeric Expectations: aggregate-level
Let's take a look at some aggregate-level numeric Expectations. We can set Expectations about the mean of a column, or the median, or the standard deviation, or the sum. Note that because it's unlikely that we'll ever know the exact value of the mean, median, standard deviation, or sum of a column, these Expectations only exist in the "Expect{Value}ToBeBetween" form, which takes a `min_value` and a `max_value`, and not in the "Expect{Value}ToEqual" form, which just takes one singular `value`.
3. Numeric Expectations: row-level
We can also establish numeric-type Expectations at the row level. Just like with the aggregate values we looked at in the previous slide (the mean, median, standard deviation, and sum), we can also expect each individual value in the column to be within a range. For instance, we know that the star rating of a product should always fall within the 0 to 5 scale. If we expect our data to be sorted by particular column, such as `"price"`, we can also set an Expectation for that column to be increasing or decreasing.
4. String Expectations
Now let's look at some string-type Expectations. If we know that the values of a particular string column, such as the SKU ID, should always be of a certain length, then we can set an Expectation for the length of that string. We can also set an Expectation that a column follow a specific Regular Expression, or regex, pattern. For instance, the link to a product should always start with the company website, followed by a string corresponding to the particular product. Sometimes we'll have columns that can follow any of multiple regex patterns. In this case, we can create a list of regex patterns, expecting that the each column value should match one pattern in the list. For example, we can consider the hero image column. A hero image is a large, attention-grabbing image displayed on a web page. If we know that the hero image for the products can come from one of two places, then we can set an Expectation that the `"hero_image"` column match one of two regex patterns.
Note that all of the Expectations on this slide are at the row level, since they apply to each row individually and do not perform any aggregation.
5. String parseability Expectations
For string columns, we can also set Expectations for the column to be parseable according to a particular format. For instance, we can set an Expectation that a column be `dateutil`-parseable if we expect that the column should be a date. Or we can set an Expectation that a column be JSON-parseable if we expect it to contain JSON-formatted data. Note that in this case, we would expect both of these Expectations to fail, since the product color should not be a date or JSON string.
6. Cheat sheet
Here's a review of the Expectations in this video. Feel free to refer back to this as you complete the exercises.
7. Let's practice!
Time for you to take a stab at establishing some numeric- and string-type Expectations!