
Load multiple data files

Importing several datasets by hand is perfectly fine. However, there will be times when you want to import many datasets without writing a separate read_csv() call for each one. You can use the glob module, which is part of Python's standard library, to look for files whose names match a pattern. The module is called "glob" because "globbing" is the name for the wildcard pattern matching used in Unix shells such as Bash.
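To make the pattern idea concrete, here is a minimal sketch that uses only the standard library. The temporary directory and the file names (flights.csv, planes.csv, notes.txt) are made up for illustration and are not part of the exercise.

```python
import glob
import os
import tempfile

# Work in a throwaway directory so the example does not touch real files.
with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file names, created empty just to have something to match.
    for name in ["flights.csv", "planes.csv", "notes.txt"]:
        open(os.path.join(tmp_dir, name), "w").close()

    # '*' matches any run of characters, so '*.csv' selects every CSV file.
    csv_files = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

    # A literal prefix narrows the match to names that start with 'p'.
    p_files = sorted(glob.glob(os.path.join(tmp_dir, "p*.csv")))

    # Keep only the file names, since the temporary path differs on every run.
    csv_names = [os.path.basename(path) for path in csv_files]
    p_names = [os.path.basename(path) for path in p_files]

print(csv_names)  # ['flights.csv', 'planes.csv']
print(p_names)    # ['planes.csv']
```

The text file is skipped because it does not end in .csv. The results are sorted here because glob() does not promise any particular order.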

The glob() function returns a list of filenames that match a specified pattern. You can then use a list comprehension to read each file into a DataFrame, collect the DataFrames in a list, and extract the DataFrame of interest from that list.
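The workflow described above might look like the following sketch. It writes two tiny CSV files (a.csv and b.csv, names and contents invented for this example) into a temporary directory so that it runs anywhere. In the exercise itself the files already exist in your working directory.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write two small, made-up CSV files so the example is self-contained.
    pd.DataFrame({"x": [1, 2, 3]}).to_csv(
        os.path.join(tmp_dir, "a.csv"), index=False
    )
    pd.DataFrame({"x": [4, 5], "y": [6, 7]}).to_csv(
        os.path.join(tmp_dir, "b.csv"), index=False
    )

    # glob() does not guarantee an order, so sort for a predictable list.
    csv_files = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))

    # One read_csv() call per matched file, collected into a list.
    dfs = [pd.read_csv(path) for path in csv_files]

# Extract the DataFrame of interest by its position in the list.
b_df = dfs[1]
print(b_df)
```

Indexing by position relies on knowing the order of csv_files, which is one reason to sort the matched filenames first.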

This is a part of the course

“Python for R Users”


Exercise instructions

  • Obtain a list of all CSV files in your current directory and assign it to csv_files.
  • Write a list comprehension that reads all the CSV files into a list, dfs.
  • Write a list comprehension that looks at the .shape of each DataFrame in the list.
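For the last step, it may help to see what .shape returns. The sketch below builds two small in-memory DataFrames (invented for illustration) instead of reading files. For R users, .shape plays a role similar to dim(): it is an attribute, not a method, and holds a (rows, columns) tuple.

```python
import pandas as pd

# Two made-up DataFrames standing in for the files loaded from disk.
dfs = [
    pd.DataFrame({"x": [1, 2, 3]}),
    pd.DataFrame({"x": [4, 5], "y": [6, 7]}),
]

# .shape is accessed without parentheses and gives (n_rows, n_columns).
shapes = [df.shape for df in dfs]
print(shapes)  # [(3, 1), (2, 2)]
```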

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

import glob
import pandas as pd

# Get a list of all the csv files
csv_files = glob.____('*.csv')

# List comprehension that loads all the files
dfs = [pd.read_csv(____) for ____ in ____]

# List comprehension that looks at the shape of all DataFrames
print(____)


This course is for R users who want to get up to speed with Python!

As a final capstone, you will apply your Python skills on the NYC Flights 2013 dataset.

