Create Expectations

1. Create Expectations

Welcome back! We've practiced connecting to data in Great Expectations, and now we're ready to begin working with the Expectations themselves.

2. Expectations

To review, Expectations are verifiable assertions about data. These can include assertions about specific columns. They can also include assertions about the dataset's shape or schema -- which is basically the blueprint or structure of the dataset and includes things such as column names and data types.

3. Expectations

We'll focus on shape and schema Expectations in this chapter.

4. The Renewable Power Generation dataset

We'll be using the Renewable Power Generation dataset from Kaggle. Here we print its first and last rows. Take note of the table shape shown at the bottom of the output.

5. Creating an Expectation

All Expectations in GX are housed in the `gx.expectations` submodule. They are classes beginning with the word "Expect". Take note that all classes in GX are written in Pascal case, which distinguishes words in the class name by capitalizing the first letter of each word, with no separators between them. Functions and methods, by contrast, use snake case: lowercase words separated by underscores.

6. Creating an Expectation

Let's take a look at an example Expectation. As we saw a couple of slides ago, the dataset has approximately 118,000 rows. Suppose we know this. We could write an Expectation using the `ExpectTableRowCountToEqual` class of the `gx.expectations` submodule, with the `value` parameter set to `118000`. To validate the Expectation, we run the `.validate()` method of our Batch object, with the `expect` parameter set equal to our Expectation.

7. Assessing an Expectation

Validating returns a Validation Result: a dictionary-like object with metadata for the Expectation, including its success status and the observed value (in this case, the row count). As we can see, this is a lot of information, and it can be difficult to parse.

8. Assessing an Expectation

We can use the Validation Result's `.describe()` method to get a summary of the important information, namely, the details of the Expectation, whether or not it succeeded, and the actual value we're comparing our Expectation against, as well as the Batch ID.

9. Assessing an Expectation

If we're just interested in the success status of the Validation Result, we can use its `.success` attribute, or we can index it with the `"success"` key, as we would a dictionary. Notice how both return the same value. Also note that this Expectation returns `False`.

10. Assessing an Expectation

To see why our Expectation failed, we can look at the observed value of the Validation Result. Just as we did with `success`, we can use the `result` attribute or key. As we can see in the output, the dataset's row count is not exactly 118,000. This explains why the Expectation failed.

11. Other common Expectations

There are other Expectations as well, such as `ExpectTableColumnCountToEqual`, which validates the table's column count; `ExpectTableColumnsToMatchSet`, which validates the set of the table's column names; and `ExpectColumnToExist`, which checks that an individual column with a given name exists. We'll become more familiar with these in the next video.

12. Cheat sheet

Here is a review of creating and validating Expectations. You can refer back to this slide as you complete the exercises.

13. Let's practice!

Now you know how to write and validate some Expectations in GX. How exciting! Time to practice!
