Practice using pandas to get just the data you want from flat files, learn how to wrangle data types and handle errors, and look into some U.S. tax data along the way.

Introduction to flat files

Get data from CSVs

Get data from other flat files

Modifying flat file imports

Import a subset of columns

Import a file in chunks

Handling errors and missing data

Specify data types

Set custom NA values

Skip bad data

Importing Data from Flat Files
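
The flat-file techniques listed in this chapter can be sketched in a few lines. This is only an illustration under stated assumptions: the rows and column names below are invented (they are not the Vermont tax data), and `on_bad_lines="skip"` assumes pandas 1.3 or later.

```python
import io

import pandas as pd

# Invented rows standing in for a tax-return extract (hypothetical columns).
raw = (
    "zipcode,agi_stub,returns,state\n"
    "05001,1,1010,VT\n"
    "05031,2,unknown,VT\n"       # "unknown" should be treated as missing
    "05032,3,250,VT,stray\n"     # one field too many: a bad line
    "05033,4,300,VT\n"
)

# Specify data types, set a custom NA value, and skip unparseable rows.
clean = pd.read_csv(
    io.StringIO(raw),
    dtype={"zipcode": str},                # keep leading zeros
    na_values={"returns": ["unknown"]},    # custom NA marker for one column
    on_bad_lines="skip",                   # drop rows with too many fields
)

# Import a subset of columns, in chunks, and aggregate as you go.
tidy = clean.to_csv(index=False)
total_returns = 0
for chunk in pd.read_csv(
    io.StringIO(tidy),
    usecols=["zipcode", "returns"],
    dtype={"zipcode": str},
    chunksize=2,
):
    total_returns += chunk["returns"].sum()  # NaN values are ignored by sum()
```

The bad-line skip and the column subset are done in separate calls on purpose, so that each option's effect is easy to see in isolation.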

Automate data imports from that staple of office life, Excel files. Import part or all of a workbook and ensure Boolean and datetime data are loaded properly, all while exploring survey data on how new developers are learning to code.

Introduction to spreadsheets

Get data from a spreadsheet

Load a portion of a spreadsheet

Getting data from multiple worksheets

Select a single sheet

Select multiple sheets

Work with multiple spreadsheets

Modifying imports: true/false data

Set Boolean columns

Set custom true/false values

Modifying imports: parsing dates

Parse simple dates

Get datetimes from multiple columns

Parse non-standard date formats

Importing Data from Excel Files
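
As a rough sketch of the spreadsheet topics above: `pd.read_excel(path, sheet_name=None)` returns a dictionary mapping sheet names to DataFrames, which can then be stacked and cleaned. To keep the example runnable without a workbook or an Excel engine, the dictionary below is built by hand from invented survey rows; the helper names, column names, and the date format are all assumptions made for illustration.

```python
import pandas as pd


def combine_sheets(sheets):
    """Stack a {sheet name: DataFrame} dict, tagging rows with their sheet."""
    frames = [df.assign(sheet=name) for name, df in sheets.items()]
    return pd.concat(frames, ignore_index=True)


def load_all_sheets(path):
    """Hypothetical helper: read every worksheet in a workbook and stack them."""
    sheets = pd.read_excel(path, sheet_name=None)  # dict of DataFrames
    return combine_sheets(sheets)


# Stand-in for what read_excel(..., sheet_name=None) would return.
sheets = {
    "2016": pd.DataFrame(
        {
            "attended_bootcamp": ["Yes", "No"],
            "started": ["03292016 21:24:57", "03302016 08:00:00"],
        }
    ),
    "2017": pd.DataFrame(
        {"attended_bootcamp": ["Yes"], "started": ["01152017 12:30:00"]}
    ),
}

responses = combine_sheets(sheets)

# Custom true/false values: map survey answers to real Booleans.
responses["attended_bootcamp"] = responses["attended_bootcamp"].map(
    {"Yes": True, "No": False}
)

# Non-standard date format: describe the layout explicitly.
responses["started"] = pd.to_datetime(
    responses["started"], format="%m%d%Y %H:%M:%S"
)
```

When reading directly from a workbook, `read_excel` also accepts `true_values` and `false_values`, so the Boolean mapping could be done at import time instead.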

Combine pandas with the powers of SQL to find out just how many problems New Yorkers have with their housing. This chapter features introductory SQL topics like WHERE clauses, aggregate functions, and basic joins.

Introduction to databases

Connect to a database

Load entire tables

Refining imports with SQL queries

Selecting columns with SQL

Selecting rows

Filtering on multiple conditions

More complex SQL queries

Getting distinct values

Counting in groups

Working with aggregate functions

Loading multiple tables with joins

Joining tables

Joining and filtering

Joining, filtering, and aggregating

Importing Data from Databases
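
The joining, filtering, and aggregating pattern from this chapter might look like the following. This is a minimal sketch, not the course's setup: it uses an in-memory SQLite database through the standard-library `sqlite3` module so that it runs anywhere, and the table names, columns, and rows are invented.

```python
import sqlite3

import pandas as pd

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
pd.DataFrame(
    {
        "created_date": ["2018-01-01", "2018-01-01", "2018-01-02"],
        "borough": ["BRONX", "BROOKLYN", "BRONX"],
        "complaint_type": ["HEAT/HOT WATER", "PLUMBING", "HEAT/HOT WATER"],
    }
).to_sql("complaints", conn, index=False)
pd.DataFrame(
    {"date": ["2018-01-01", "2018-01-02"], "tmin": [7, 13]}
).to_sql("weather", conn, index=False)

# Join on date, keep only heat complaints, and count them per borough.
query = """
SELECT c.borough,
       COUNT(*)    AS n_complaints,
       MIN(w.tmin) AS coldest
  FROM complaints AS c
  JOIN weather AS w
    ON c.created_date = w.date
 WHERE c.complaint_type = 'HEAT/HOT WATER'
 GROUP BY c.borough;
"""
heat = pd.read_sql(query, conn)
conn.close()
```

Doing the filter and aggregation in SQL means only the summarized rows are transferred into pandas, which matters more as tables grow.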

Learn how to work with JSON data and web APIs by exploring a public dataset and getting cafe recommendations from Yelp. End by learning some techniques for combining datasets once they have been loaded into DataFrames.

Introduction to JSON

Load JSON data

Work with JSON orientations

Introduction to APIs

Get data from an API

Set API parameters

Set request headers

Working with nested JSONs

Flatten nested JSONs

Handle deeply nested data

Combining multiple datasets

Concatenate dataframes

Merge dataframes

Wrap-up

Importing JSON Data and Working with APIs
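
The JSON and API topics in this chapter can be sketched as follows. Everything specific here is an assumption: the payloads are invented and only loosely shaped like a business-search response, `fetch_json` is a hypothetical helper that is defined but never called, and no real endpoint or API key appears.

```python
import io

import pandas as pd


def fetch_json(url, params, headers):
    """Hypothetical helper: GET a URL with query parameters and headers."""
    import requests  # imported here so the rest of the sketch runs without it

    response = requests.get(url, params=params, headers=headers)
    response.raise_for_status()
    return response.json()


# JSON orientations: tell read_json how the records are laid out.
split_json = (
    '{"columns": ["name", "rating"], "index": [0, 1], '
    '"data": [["Cafe A", 4.5], ["Cafe B", 4.0]]}'
)
ratings = pd.read_json(io.StringIO(split_json), orient="split")

# Two made-up "pages" of nested results.
page1 = [
    {
        "name": "Cafe A",
        "rating": 4.5,
        "coordinates": {"latitude": 40.7, "longitude": -74.0},
        "categories": [{"alias": "coffee", "title": "Coffee & Tea"}],
    }
]
page2 = [
    {
        "name": "Cafe B",
        "rating": 4.0,
        "coordinates": {"latitude": 40.8, "longitude": -73.9},
        "categories": [
            {"alias": "cafes", "title": "Cafes"},
            {"alias": "bakeries", "title": "Bakeries"},
        ],
    }
]

# Flatten one level of nesting per page, then concatenate the pages.
cafes = pd.concat(
    [pd.json_normalize(page, sep="_") for page in (page1, page2)],
    ignore_index=True,
)

# Deeper nesting: one row per category, carrying business-level fields along.
categories = pd.json_normalize(
    page1 + page2,
    sep="_",
    record_path="categories",
    meta=["name", "rating"],
    meta_prefix="biz_",
)

# Merge with another (invented) dataset on a shared key.
visits = pd.DataFrame({"name": ["Cafe A", "Cafe B"], "visits": [12, 7]})
merged = cafes.merge(visits, on="name")
```

Concatenation stacks rows that share the same columns, while merging adds columns by matching key values, so the two steps are complementary rather than interchangeable.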

Vermont tax return data by ZIP code

FreeCodeCamp New Developer Survey response subset

NYC weather and 311 housing complaints

Before you can analyze data, you first have to acquire it. This course teaches you how to build pipelines to import data kept in common storage formats. You’ll use pandas, a major Python library for analytics, to get data from a variety of sources, from spreadsheets of survey responses, to a database of public service requests, to an API for a popular review site. Along the way, you’ll learn how to fine-tune imports to get only what you need and to address issues like incorrect data types. Finally, you’ll assemble a custom dataset from a mix of sources.

Intermediate Python

Intermediate SQL

Learn to acquire data from common file formats and systems such as CSV files, spreadsheets, JSON, SQL databases, and APIs.

Streamlined Data Ingestion with pandas

Data Engineer in Python
