Sneak Peek into GX
Nice job creating your Data Context! This is a powerful first step into the world of Great Expectations. Let's take a sneak peek at what you'll be able to do by the end of the course.
The code on the right uses the Data Context to create a pandas Data Source and Data Asset, which define the format of the data. Then, it creates a Batch Definition and uses it to read in a Batch of data. Finally, it creates an Expectation Suite, which contains Expectations, and a Validation Definition, which runs the Expectation Suite against the Batch of data. Don't worry if you don't understand these terms right now. It'll all be clear by the end of the course!
Great Expectations has already been imported for you as gx.
This exercise is part of the course
Introduction to Data Quality with Great Expectations
Exercise instructions
- Press Run Code to see the code output.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create Data Context
context = gx.get_context()
# Create pandas Data Source, Data Asset, and Batch Definition
data_source = context.data_sources.add_pandas(
    name="my_pandas_datasource"
)
data_asset = data_source.add_dataframe_asset(
    name="my_data_asset"
)
batch_definition = data_asset.add_batch_definition_whole_dataframe(
    name="my_batch_definition"
)
# `dataframe` is not defined here; it is assumed to be a pandas DataFrame
# preloaded by the exercise environment
batch = batch_definition.get_batch(
    batch_parameters={"dataframe": dataframe}
)
# Create Expectation Suite and Validation Definition
suite = context.suites.add(
    gx.ExpectationSuite(name="my_suite", suite_parameters={})
)
validation_definition = gx.ValidationDefinition(
    data=batch_definition, suite=suite, name="validation"
)
# Establish and evaluate an Expectation
expectation = gx.expectations.ExpectTableRowCountToBeBetween(
    min_value=50000, max_value=100000
)
validation_results = batch.validate(expect=expectation)
print(validation_results.success)
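To build some intuition for what the final `print` reports, here is a minimal plain-Python sketch of the kind of check a row-count Expectation performs. It is not how Great Expectations implements it: `row_count_between` is a hypothetical helper, the placeholder rows are made up for illustration, and the bounds are assumed to be inclusive.

```python
from typing import Optional, Sequence


def row_count_between(
    rows: Sequence[object],
    min_value: Optional[int] = None,
    max_value: Optional[int] = None,
) -> dict:
    """Hypothetical stand-in for a row-count check (not a Great Expectations API).

    Assumes inclusive bounds; a bound of None means "no limit on that side".
    """
    observed = len(rows)
    above_min = min_value is None or observed >= min_value
    below_max = max_value is None or observed <= max_value
    return {"success": above_min and below_max, "observed_value": observed}


# 75,000 placeholder rows fall inside the 50,000 to 100,000 range used above.
result = row_count_between(range(75_000), min_value=50_000, max_value=100_000)
print(result["success"])

# 10 placeholder rows fall below the minimum, so the check does not pass.
small = row_count_between(range(10), min_value=50_000, max_value=100_000)
print(small["success"])
```

In the exercise, `validation_results.success` plays the same role as `result["success"]` here: a single boolean that says whether the Batch met the Expectation.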