
Extracting data from parquet files

One of the most common ways to ingest data from a source system is to read it from a file, such as a CSV file. As datasets have grown, the need for more efficient file formats has led to column-oriented file types, such as parquet files.
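One practical consequence of a column-oriented layout is that a reader can request only the columns it needs. The sketch below illustrates this with the `columns` parameter of `pd.read_parquet`. The sample data, the file name `sample_sales.parquet`, and the column names are hypothetical and stand in for a real source file. It also assumes a parquet engine (`pyarrow` or `fastparquet`) is installed, and skips the round trip if neither is found.

```python
import importlib.util
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in data; not the schema of any real sales file.
sample = pd.DataFrame({
    "order_id": [1, 2, 3],
    "region": ["north", "south", "east"],
    "amount": [19.99, 5.00, 42.50],
})

# pandas delegates parquet I/O to an installed engine; use whichever is available.
engine = next(
    (name for name in ("pyarrow", "fastparquet")
     if importlib.util.find_spec(name) is not None),
    None,
)

subset = None
if engine is None:
    print("No parquet engine installed; skipping the example.")
else:
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "sample_sales.parquet"
        # Write the sample data to a temporary parquet file.
        sample.to_parquet(path, engine=engine, index=False)
        # Read back only two of the three columns.
        subset = pd.read_parquet(path, engine=engine, columns=["region", "amount"])
    print(subset.columns.tolist())
```

With a row-oriented format such as CSV, selecting columns generally still means parsing every line of the file.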

In this exercise, you'll practice extracting data from a parquet file.

This exercise is part of the course

ETL and ELT in Python


Exercise instructions

  • Read the parquet file at the path "sales_data.parquet" into a pandas DataFrame.
  • Print the data types of the DataFrame's columns.
  • Output the shape of the DataFrame, as well as its head.
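The inspection steps do not depend on the file format, so they can be tried on any DataFrame. The following is a minimal sketch on a small in-memory DataFrame. The variable name matches the exercise, but the columns and values are hypothetical and are not the contents of "sales_data.parquet".

```python
import pandas as pd

# Hypothetical data standing in for a DataFrame loaded from a file.
sales_data = pd.DataFrame({
    "order_id": [1, 2, 3],
    "region": ["north", "south", "east"],
    "amount": [19.99, 5.00, 42.50],
})

# dtypes is an attribute: one data type per column.
print(sales_data.dtypes)

# shape is an attribute: a (rows, columns) tuple.
print(sales_data.shape)

# head() is a method: the first rows (5 by default, 2 requested here).
print(sales_data.head(2))
```

Note that `dtypes` and `shape` are attributes, while `head()` is a method and needs parentheses.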

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import pandas as pd

# Read the sales data into a DataFrame
sales_data = pd.____("____", engine="fastparquet")

# Check the data types of the columns of the DataFrame
print(sales_data.____)

# Print the shape of the DataFrame, as well as the head
print(sales_data.____)
print(sales_data.____())