Data processing with csvkit

Once we have assembled a dataset, we still need to process and clean the data before moving on to more advanced analysis such as predictive modeling. In this capstone exercise, we will use several csvkit commands for common data processing and cleaning tasks.

The Excel file Spotify_201809_201810.xlsx contains two sheets (tabs), named Spotify201809 and Spotify201810. We will split the Excel file into its individual sheets, preview summary statistics, remove some columns, and then stack the two sheets back together into a single csv file, ready for further analysis.
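
The later steps of this workflow (previewing statistics, removing columns, stacking) can be sketched on toy data as follows. This is a minimal illustration, not the course solution: the files `demo_201809.csv` and `demo_201810.csv` and the columns `track_id`, `danceability`, and `duration_ms` are hypothetical stand-ins for the real sheets, and the sketch assumes csvkit is installed (for example via `pip install csvkit`).

```shell
# Hypothetical sample data standing in for the two converted sheets.
# File names and column names are assumptions for illustration only.
cat > demo_201809.csv <<'EOF'
track_id,danceability,duration_ms
a1,0.61,210000
a2,0.45,185000
EOF

cat > demo_201810.csv <<'EOF'
track_id,danceability,duration_ms
b1,0.72,199000
EOF

# csvkit is a separate install; skip the demo if it is not on the PATH.
if command -v csvstack >/dev/null 2>&1; then
  # Preview summary statistics for one file
  csvstat demo_201809.csv

  # Remove the track_id column (-C keeps every column except those listed)
  csvcut -C "track_id" demo_201809.csv > demo_201809_cut.csv
  csvcut -C "track_id" demo_201810.csv > demo_201810_cut.csv

  # Stack the two files row-wise into a single csv
  csvstack demo_201809_cut.csv demo_201810_cut.csv > demo_stacked.csv

  # Print the stacked result as a readable table
  csvlook demo_stacked.csv
else
  echo "csvkit not found; install it with: pip install csvkit"
fi
```

Stacking works cleanly here because both files share the same header after the column is removed; with differing headers the result would need closer inspection.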

This exercise is part of the course

<Kurs>Data Processing in Shell</Kurs>

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Convert the Spotify201809 sheet into its own csv file 
___ Spotify_201809_201810.xlsx ___ "___" ___ Spotify201809.csv

# Check to confirm name and location of data file
ls