Get data from other flat files
While CSVs are the most common kind of flat file, you will sometimes find files that use different delimiters. read_csv() can load all of these with the help of the sep keyword argument. By default, pandas assumes that the separator is a comma, which is why we do not need to specify sep for CSVs.
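As a minimal sketch of how sep behaves, the snippet below parses the same small table written with three different delimiters. The in-memory sample text and the variable names are assumptions made for illustration; only read_csv() and its sep argument come from the exercise.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: the same two rows written with three delimiters.
comma_text = "agi_stub,N1\n1,100\n2,80\n"
semicolon_text = comma_text.replace(",", ";")
tab_text = comma_text.replace(",", "\t")

# The default separator is a comma, so sep can be omitted for CSV input.
from_csv = pd.read_csv(StringIO(comma_text))

# Other delimiters are passed explicitly through sep.
from_semicolon = pd.read_csv(StringIO(semicolon_text), sep=";")
from_tsv = pd.read_csv(StringIO(tab_text), sep="\t")

# Check whether all three calls parse to the same DataFrame.
print(from_csv.equals(from_semicolon) and from_csv.equals(from_tsv))
```

If sep were left out for the tab-separated text, pandas would look for commas, find none, and read each row as a single column.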
The version of Vermont tax data here is a tab-separated values (TSV) file, so you will need to use sep to pass in the correct delimiter when reading the file. Remember that tabs are represented as \t. Once the file has been loaded, the remaining code groups the data by the agi_stub field, which contains income range categories, and sums the N1 field (the number of tax returns) to create a chart of tax returns by income category.
This exercise is part of the course
Streamlined Data Ingestion with pandas
Exercise instructions
- Import pandas with the alias pd.
- Load vt_tax_data_2016.tsv, making sure to set the correct delimiter with the sep keyword argument.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# matplotlib.pyplot is needed for plt.show() below
import matplotlib.pyplot as plt

# Import pandas with the alias pd
____
# Load TSV using the sep keyword argument to set delimiter
data = ____(____, ____)
# Plot the total number of tax returns by income group
counts = data.groupby("agi_stub").N1.sum()
counts.plot.bar()
plt.show()
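One way the completed template might look is sketched below. Because vt_tax_data_2016.tsv is not included here, the sketch reads a small made-up tab-separated sample from memory instead. The sample rows and the sample_tsv name are hypothetical, and only the agi_stub and N1 columns from the exercise are modeled.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for vt_tax_data_2016.tsv: a few tab-separated rows
# with only the two columns the exercise uses.
sample_tsv = (
    "agi_stub\tN1\n"
    "1\t100\n"
    "1\t50\n"
    "2\t80\n"
    "3\t30\n"
)

# Load the TSV; with the real file, the first argument would be its path.
data = pd.read_csv(StringIO(sample_tsv), sep="\t")

# Total number of tax returns (N1) per income group (agi_stub).
counts = data.groupby("agi_stub").N1.sum()
print(counts)
```

With matplotlib.pyplot imported as plt, calling counts.plot.bar() followed by plt.show() would then draw the bar chart, as in the template.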