
Extracting data from parquet files

One of the most common ways to ingest data from a source system is to read it from a file, such as a CSV file. As datasets have grown, the need for more efficient file formats has led to column-oriented file types, such as parquet files.

In this exercise, you'll practice extracting data from a parquet file.

This exercise is part of the course

ETL and ELT in Python


Exercise instructions

  • Read the parquet file at the path "sales_data.parquet" into a pandas DataFrame.
  • Check the data types of the DataFrame's columns by printing them.
  • Output the shape of the DataFrame, as well as its head.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

import pandas as pd

# Read the sales data into a DataFrame
sales_data = pd.____("____", engine="fastparquet")

# Check the data types of the DataFrame's columns
print(sales_data.____)

# Print the shape of the DataFrame, as well as the head
print(sales_data.____)
print(sales_data.____())