Load multiple data files

It's perfectly fine to import multiple datasets manually. However, there will be times when you want to import a batch of datasets without writing a separate read_csv() call for each one. You can use the glob module from Python's standard library to look for files that match a pattern. The module is called "glob" because "globbing" is the name for the way wildcard patterns, such as *.csv, are specified in the Bash shell.

The glob() function returns a list of filenames that match a specified pattern. You can then use a list comprehension to import multiple files into a list, and then you can extract the DataFrame of interest.
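The workflow described above can be sketched as follows. This is a minimal illustration, not the exercise solution: the helper name load_csvs, the file names first.csv and second.csv, and their contents are all hypothetical, and the files are written to a temporary directory so the example runs on its own. The call to sorted() is an added assumption, used because glob.glob() does not guarantee any particular order.

```python
import glob
import os
import tempfile

import pandas as pd


def load_csvs(directory):
    """Read every CSV file in `directory` into a list of DataFrames.

    Hypothetical helper for illustration; returns the matched paths
    alongside the DataFrames so the two lists can be lined up by index.
    """
    pattern = os.path.join(directory, "*.csv")
    # glob.glob() returns matches in arbitrary order, so sort them
    # to make the position of each DataFrame predictable.
    csv_files = sorted(glob.glob(pattern))
    # One read_csv() call per matched file, collected in a list.
    dfs = [pd.read_csv(path) for path in csv_files]
    return csv_files, dfs


# Create two small made-up CSV files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}).to_csv(
        os.path.join(tmp, "first.csv"), index=False
    )
    pd.DataFrame({"a": [7, 8]}).to_csv(
        os.path.join(tmp, "second.csv"), index=False
    )

    csv_files, dfs = load_csvs(tmp)

# Inspect the (rows, columns) shape of every DataFrame.
shapes = [df.shape for df in dfs]
print(shapes)  # [(3, 2), (2, 1)]

# Extract the DataFrame of interest by its position in the list.
first_df = dfs[0]
```

Keeping the list of paths next to the list of DataFrames makes it easy to tell which file a given DataFrame came from.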

This exercise is part of the course

Python for R Users

Exercise instructions

  • Obtain a list of all csv files in your current directory and assign it to csv_files.
  • Write a list comprehension that reads in all the csv files into a list, dfs.
  • Write a list comprehension that looks at the .shape of each DataFrame in the list.

Hands-on interactive exercise

Try this exercise by completing the sample code below.

import glob
import pandas as pd

# Get a list of all the csv files
csv_files = glob.____('*.csv')

# List comprehension that loads all of the files
dfs = [pd.read_csv(____) for ____ in ____]

# List comprehension that looks at the shape of all DataFrames
print(____)