1. Apply Expectations to New Data
Welcome back! Now that we've learned how to create and run Validation Definitions, let's learn how to deploy them using Checkpoints.
2. Checkpoints
Checkpoints are objects that group Validation Definitions and run them with shared parameters.
They are the primary means for validating data in a production deployment of GX.
Checkpoints can also be configured to perform automated Actions based on the Validation Results, such as sending a notification via email, Microsoft Teams, or Slack when a validation fails.
3. Why use Checkpoints?
Checkpoints are helpful for two main reasons.
First, they allow us to run multiple Validation Definitions against the same Batch of data.
Second, they can be configured to automatically perform Actions like the ones we touched on in the last slide. Checkpoint Actions require some configuration, so we won't go into depth on them in this course.
4. Creating a Checkpoint
To create a Checkpoint, we use GX's `Checkpoint` class, which has two required arguments:
the name we want to give our Checkpoint,
and a list of Validation Definitions to run.
We can also optionally add a list of Actions to run, as I mentioned.
5. Checkpoint errors
Recall that we had to add our Expectation Suite to the Data Context before running it through a Validation Definition. Similarly, we need to add our Validation Definition before running it through a Checkpoint, or else we'll get an error like this, telling us our Validation Definition hasn't been added.
6. Adding a Validation Definition
We add our Validation Definition to the Data Context in an analogous way to how we added our Expectation Suite. We use the `.add()` method of the Context's `.validation_definitions` attribute, passing in our Validation Definition as an argument.
7. Running a Checkpoint
We run a Checkpoint the exact same way we would a single Validation Definition: we use the `.run()` method and pass our DataFrame inside a dictionary to the `batch_parameters` argument.
This outputs a dictionary-like object with the results of the run, similar to Validation Results.
As you can see, this output is not the easiest to read, and it's only a small fraction of the full result! Fortunately, there are prettier ways to represent our Checkpoint Results.
8. Assessing Checkpoint Results
One way to view our Checkpoint Results is via their `.success` attribute.
As with Validation Results, this attribute will be True if ALL of our Expectations were met, otherwise False.
This approach is helpful for knowing whether or not the entire validation succeeded, but it doesn't give us any insight into the results on an Expectation level.
For that, we can use the same `.describe()` method we've been using for Validation Results.
9. Assessing Checkpoint Results
This looks like our Validation Results -- better, but still a bit tedious.
10. Data Docs
Another tool we can use is Data Docs, which translate Expectations, Validation Results, and other metadata into human-readable documentation.
They can be populated using Checkpoint Actions.
11. Data Docs
Data Docs will open as a static webpage, with a UI displaying each Expectation, whether or not it succeeded, and any associated metadata. Pretty neat, huh?
While we won't go into them in depth in this course, it's worth knowing what they are and why they're useful.
12. Cheat sheet
Here's a summary slide on all things Checkpoints. Feel free to refer back to this while completing the exercises.
13. Let's practice!
Time to practice creating your own Checkpoints!