
Parsing a messy CSV

A third-party ebook vendor exports Seattle digital checkouts as a semicolon-separated CSV with two metadata rows above the real header. Configure your scan to handle the layout so the team can preview a clean table.

polars is loaded as pl. The path to the vendor file is in MESSY_CSV_PATH.

This exercise is part of the course

Scaling and Optimizing Data Pipelines with Polars


Exercise instructions

  • Skip the 2 metadata rows above the header.
  • Tell Polars that the columns are separated by semicolons.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

result = pl.scan_csv(
    MESSY_CSV_PATH,
    # Skip the 2 metadata rows above the header
    skip_rows=____,
    # Columns are separated by semicolons
    separator="____",
).head(5).collect()
print(result)