Name: Data Cleaning and Munging on the Command Line - Stacking data and chaining commands with csvkit
Uploaded: 2024-08-02T00:00:00Z
Description: We continue our data journey from downloading data to processing it. In this chapter, we use the command-line library csvkit to convert, preview, filter, and manipulate files in preparation for further analysis.

1
Downloading Data on the Command Line
In this chapter, we learn how to download data files from web servers via the command line. In the process, we also learn about documentation manuals, option flags, and multi-file processing.
2
Data Cleaning and Munging on the Command Line
We continue our data journey from downloading data to processing it. In this chapter, we use the command-line library csvkit to convert, preview, filter, and manipulate files in preparation for further analysis.
3
Database Operations on the Command Line
In this chapter, we dig deeper into all that the csvkit library has to offer. In particular, we focus on database operations we can do on the command line, including table creation, data pulls, and various ETL transformations.
4
Data Pipeline on the Command Line
In the last chapter, we connect the command line to other data science languages and learn how they can work together. Using Python as a case study, we learn to execute Python on the command line, install dependencies with the package manager pip, and build an entire model pipeline from the command line.
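
As a rough illustration of the stacking and chaining covered in chapter 2, the sketch below stacks two CSV files and pipes one csvkit command into another. It assumes csvkit is already installed (for example via `pip install csvkit`); the file names and sample rows are hypothetical and exist only to make the example self-contained.

```shell
# Create two small sample files with the same header (hypothetical data)
printf 'name,year\nalpha,2017\nbeta,2018\n' > part1.csv
printf 'name,year\ngamma,2018\ndelta,2019\n' > part2.csv

# Stack the files row-wise into one file; the shared header is kept once
csvstack part1.csv part2.csv > stacked.csv

# Chain commands with a pipe: keep rows whose year column contains 2018,
# then keep only the name column
csvgrep -c year -m 2018 stacked.csv | csvcut -c name > names_2018.csv

# Preview the filtered result as a table in the terminal
csvlook names_2018.csv
```

The pipe passes the output of `csvgrep` directly to `csvcut`, so no intermediate file is needed between the filtering and column-selection steps.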