
Get data from other flat files

While CSVs are the most common kind of flat file, you will sometimes find files that use different delimiters. read_csv() can load all of these with the help of the sep keyword argument. By default, pandas assumes that the separator is a comma, which is why we do not need to specify sep for CSVs.
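As a minimal sketch of the default behavior, the snippet below parses the same small table twice: once comma-separated with no sep argument, and once pipe-separated with sep="|". The inline text, the column names state and returns, and the pipe delimiter are hypothetical and chosen only for illustration; io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical inline data standing in for files on disk
comma_text = "state,returns\nVT,10\nNH,20\n"
pipe_text = "state|returns\nVT|10\nNH|20\n"

# Comma is the default separator, so sep can be omitted here
comma_df = pd.read_csv(io.StringIO(comma_text))

# Any other delimiter has to be passed explicitly through sep
pipe_df = pd.read_csv(io.StringIO(pipe_text), sep="|")

# Both calls should produce the same two-column DataFrame
print(comma_df.equals(pipe_df))
```

If sep were left out of the second call, pandas would look for commas, find none, and load each row as a single column.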

The version of Vermont tax data here is a tab-separated values (TSV) file, so you will need to use sep to pass in the correct delimiter when reading the file. Remember that tabs are represented as \t. Once the file has been loaded, the remaining code groups the data by the agi_stub field, which contains income range categories, and sums the N1 field to create a chart of tax returns by income category.
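The same pattern can be sketched on a tiny, made-up table, assuming only that the data has an agi_stub column and an N1 column. The values below are invented and are not taken from vt_tax_data_2016.tsv; io.StringIO again stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical miniature of the tax data; the values are made up
tsv_text = (
    "agi_stub\tN1\n"
    "1\t100\n"
    "1\t50\n"
    "2\t70\n"
)

# "\t" tells read_csv that fields are separated by tab characters
data = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Sum the N1 values within each income category
counts = data.groupby("agi_stub").N1.sum()
print(counts)
```

Here the two rows in category 1 are combined into a single total, which is the shape of data the bar chart in the exercise is drawn from.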

This exercise is part of the course

Streamlined Data Ingestion with pandas


Exercise instructions

  • Import pandas with the alias pd.
  • Load vt_tax_data_2016.tsv, making sure to set the correct delimiter with the sep keyword argument.

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

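# Import pandas with the alias pd
____

# Import pyplot for the plt.show() call below (assumed not preloaded)
import matplotlib.pyplot as plt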

# Load TSV using the sep keyword argument to set delimiter
data = ____(____, ____)

# Plot the total number of tax returns by income group
counts = data.groupby("agi_stub").N1.sum()
counts.plot.bar()
plt.show()