Using pandas to import flat files as DataFrames (2)

In the last exercise, you were able to import flat files into a pandas DataFrame. As a bonus, it is then straightforward to retrieve the corresponding numpy array using the method .to_numpy(). You'll now have a chance to do this using the MNIST dataset, which is available as digits.csv.

There are a number of arguments that pd.read_csv() takes that you'll find useful for this exercise:

nrows allows you to specify how many rows to read from the file. For example, nrows=10 will only import the first 10 rows.
header accepts row numbers to use as the column labels and marks the start of the data. If the file does not contain a header row, you can set header=None, and pandas will automatically assign integer column labels starting from 0 (e.g., 0, 1, 2, …).

Bu egzersiz

Introduction to Importing Data in Python

kursunun bir parçasıdır

Kursu Görüntüle

Egzersiz talimatları

Import the first 5 rows of the file into a DataFrame using the function pd.read_csv() and assign the result to data. You'll need to use the arguments nrows and header. Note that there is no header row in this file.
Build a numpy array from the resulting DataFrame in data and assign to data_array.
Execute print(type(data_array)) to print the datatype of data_array.

Uygulamalı interaktif egzersiz

Bu örnek kodu tamamlayarak bu egzersizi bitirin.

# Assign the filename: file
file = 'digits.csv'

# Read the first 5 rows of the file into a DataFrame: data
data = ____(____, ____, ____)

# Build a numpy array from the DataFrame: data_array
data_array = ____

# Print the datatype of data_array to the shell
print(type(data_array))

Kodu Düzenle ve Çalıştır