Defining the schema
Let's start by defining the expected schema for data validation. This is a critical step in ensuring data quality throughout the ETL pipeline.
You'll use the pointblank library to define the schema structure.
The dataset has already been loaded for you as ts.
This exercise is part of the course
Designing Forecasting Pipelines for Production
Exercise instructions
- Start by importing pointblank.
- Define the schema using the right method.
- Set the respondent column to object type and the value column to float64 type.
Interactive exercise
Complete the sample code to successfully finish this exercise.
# Import the required library
import ____ as ____
# Define the schema and set columns
table_schema = pb.____(
    columns=[
        ("period", "datetime64[ns]"),
        ("respondent", "____"),
        ("respondent-name", "object"),
        ("type", "object"),
        ("type-name", "object"),
        ("value", "____"),
        ("value-units", "object"),
    ]
)
print(table_schema)
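
To see what a schema of (column, dtype) pairs is checking, the same idea can be sketched in plain pandas, without pointblank. This is only an illustration under assumptions: `ts_demo` is a hypothetical two-row stand-in for `ts` (the real dataset is not shown here), its values are made up, and `schema_mismatches` is a helper invented for this sketch, not part of any library.

```python
import pandas as pd

# Hypothetical stand-in for `ts`; the values are invented for illustration.
# Dtypes are set explicitly so the result does not depend on pandas defaults.
ts_demo = pd.DataFrame({
    "period": pd.to_datetime(
        ["2024-01-01 00:00", "2024-01-01 01:00"]
    ).astype("datetime64[ns]"),
    "respondent": pd.Series(["US48", "US48"], dtype="object"),
    "respondent-name": pd.Series(["Lower 48", "Lower 48"], dtype="object"),
    "type": pd.Series(["D", "D"], dtype="object"),
    "type-name": pd.Series(["Demand", "Demand"], dtype="object"),
    "value": pd.Series([1.0, 2.0], dtype="float64"),
    "value-units": pd.Series(["MWh", "MWh"], dtype="object"),
})

# Expected column names and dtype strings, taken from the exercise text.
expected = {
    "period": "datetime64[ns]",
    "respondent": "object",
    "respondent-name": "object",
    "type": "object",
    "type-name": "object",
    "value": "float64",
    "value-units": "object",
}


def schema_mismatches(df, expected_dtypes):
    """Return {column: (expected, actual)} for missing or mistyped columns."""
    mismatches = {}
    for col, dtype in expected_dtypes.items():
        # A missing column is reported with an actual dtype of None.
        actual = str(df[col].dtype) if col in df.columns else None
        if actual != dtype:
            mismatches[col] = (dtype, actual)
    return mismatches


mismatches = schema_mismatches(ts_demo, expected)
print(mismatches)  # an empty dict means every column matched
```

Running `ts.dtypes` on the real dataset is a quick way to see which dtype strings the data actually carries before declaring them in a schema.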