Specify data types
When loading a flat file, pandas infers the best data type for each column. Sometimes its guesses are off, particularly for numbers that represent groups or qualities instead of quantities.
Looking at the data dictionary for vt_tax_data_2016.csv reveals two such columns. The agi_stub column contains numbers that correspond to income categories, and zipcode has 5-digit values that should be strings -- treating them as integers means we lose leading 0s, which are meaningful. Let's specify the correct data types with the dtype argument.
pandas has been imported for you as pd.
This exercise is part of the course Streamlined Data Ingestion with pandas.
Hands-on interactive exercise
Try this exercise by completing the sample code.
# Load csv with no additional arguments
data = ____("vt_tax_data_2016.csv")
# Print the data types
print(____)
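
To make the effect of the dtype argument concrete, here is a minimal sketch that compares default type inference with explicit types. It does not read vt_tax_data_2016.csv: the rows in sample_text and the n_returns column are hypothetical stand-ins invented for illustration, and only the agi_stub and zipcode column names come from the exercise. Treating agi_stub as "category" and zipcode as str is one reasonable choice under those assumptions, not necessarily the exercise's official solution.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for vt_tax_data_2016.csv.
# The values and the n_returns column are made up for illustration.
sample_text = (
    "agi_stub,zipcode,n_returns\n"
    "1,05001,100\n"
    "2,05030,250\n"
)

# Default inference: both columns are parsed as integers,
# so the leading zero in each zip code is dropped.
inferred = pd.read_csv(io.StringIO(sample_text))
print(inferred.dtypes)
print(inferred["zipcode"].tolist())

# Explicit types: dtype takes a dict mapping column names to types.
# Columns not listed (here n_returns) are still inferred.
typed = pd.read_csv(
    io.StringIO(sample_text),
    dtype={"agi_stub": "category", "zipcode": str},
)
print(typed.dtypes)
print(typed["zipcode"].tolist())
```

Passing a dict lets you override only the columns whose inferred type is wrong and leave the rest to pandas.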