
Logging within a data pipeline

In this exercise, you'll revisit the function you wrote in a previous video and practice adding logging to it. Logs make it easier to troubleshoot errors and to check the effect of changes to the logic.

pandas has been imported as pd. The logging module has also been imported, and the default log level has been set to "debug", so both debug-level and info-level messages are shown.
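The exercise environment does this setup for you. As a minimal sketch of what it might look like, assuming logging.basicConfig is used on the root logger (the format string and force=True are illustrative choices, not taken from the exercise):

```python
import logging

# Assumed setup: the exercise only states that the log level is "debug".
# force=True (Python 3.8+) replaces any handlers already attached to the
# root logger, so the call takes effect even if logging was configured earlier.
logging.basicConfig(
    format="%(levelname)s: %(message)s",
    level=logging.DEBUG,
    force=True,
)

# With the level at DEBUG, messages of every standard severity are emitted.
logging.debug("Emitted, because the level is DEBUG.")
logging.info("Also emitted, because INFO ranks above DEBUG.")
```

If the level were left at the library default (WARNING), neither of these two messages would appear.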

This exercise is part of the course

ETL and ELT in Python


Exercise instructions

  • Create an info-level log after the transformation, passing the string: "Transformed 'Order Date' column to type 'datetime'."
  • Log the .shape of the DataFrame at the debug-level before and after filtering.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

def transform(raw_data):
    raw_data["Order Date"] = pd.to_datetime(raw_data["Order Date"], format="%m/%d/%y %H:%M")
    clean_data = raw_data.loc[raw_data["Price Each"] < 10, :]
    
    # Create an info log regarding transformation
    logging.____("Transformed 'Order Date' column to type 'datetime'.")
    
    # Create debug-level logs for the DataFrame before and after filtering
    ____(f"Shape of the DataFrame before filtering: {raw_data.shape}")
    ____(f"Shape of the DataFrame after filtering: {clean_data.shape}")
    
    return clean_data
  
clean_sales_data = transform(raw_sales_data)
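
For reference, one possible way to fill in the blanks is sketched below. It is a self-contained version under stated assumptions, not the official solution: the rows in raw_sales_data are hypothetical sample values invented for illustration, and the logging setup is an assumed stand-in for what the exercise environment provides.

```python
import logging

import pandas as pd

# Assumed setup; the exercise environment configures logging for you.
logging.basicConfig(format="%(levelname)s: %(message)s", level=logging.DEBUG, force=True)


def transform(raw_data):
    # Parse the order timestamps, then keep only rows priced under 10.
    raw_data["Order Date"] = pd.to_datetime(raw_data["Order Date"], format="%m/%d/%y %H:%M")
    clean_data = raw_data.loc[raw_data["Price Each"] < 10, :]

    # Info-level log confirming the type conversion
    logging.info("Transformed 'Order Date' column to type 'datetime'.")

    # Debug-level logs with the shape before and after filtering
    logging.debug(f"Shape of the DataFrame before filtering: {raw_data.shape}")
    logging.debug(f"Shape of the DataFrame after filtering: {clean_data.shape}")

    return clean_data


# Hypothetical sample data, not the dataset used in the course.
raw_sales_data = pd.DataFrame({
    "Order Date": ["04/19/19 08:46", "04/07/19 22:30", "04/12/19 14:38"],
    "Price Each": [11.95, 3.84, 2.99],
})

clean_sales_data = transform(raw_sales_data)
```

Comparing the two logged shapes shows how many rows the price filter removed, which is a quick sanity check when the filtering logic changes.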