LoslegenKostenlos loslegen

Get data from other flat files

While CSVs are the most common kind of flat file, you will sometimes find files that use different delimiters. read_csv() can load all of these with the help of the sep keyword argument. By default, pandas assumes that the separator is a comma, which is why we do not need to specify sep for CSVs.

The version of Vermont tax data here is a tab-separated values file (TSV), so you will need to use sep to pass in the correct delimiter when reading the file. Remember that tabs are represented as \t. Once the file has been loaded, the remaining code groups the N1 field, which contains income range categories, to create a chart of tax returns by income category.

Diese Übung ist Teil des Kurses

Streamlined Data Ingestion with pandas

Kurs anzeigen

Anleitung zur Übung

  • Import pandas with the alias pd.
  • Load vt_tax_data_2016.tsv, making sure to set the correct delimiter with the sep keyword argument.

Interaktive Übung

Versuche dich an dieser Übung, indem du diesen Beispielcode vervollständigst.

# Import pandas with the alias pd
____

# Load TSV using the sep keyword argument to set delimiter
data = ____(____, ____)

# Plot the total number of tax returns by income group
counts = data.groupby("agi_stub").N1.sum()
counts.plot.bar()
plt.show()
Code bearbeiten und ausführen