
Defining a schema

Defining a schema explicitly helps with data quality and import performance. As mentioned during the lesson, we'll create a simple schema to read in the following columns:

  • Name
  • Age
  • City

The Name and City columns are StringType() and the Age column is an IntegerType().

This exercise is part of the course Cleaning Data with PySpark.

Exercise instructions

  • Import * from the pyspark.sql.types module.
  • Define a new schema using the StructType class.
  • Define a StructField for each of name, age, and city. Each field should use the correct data type and not be nullable.

Hands-on interactive exercise

Complete this sample code to finish the exercise.

# Import everything from the pyspark.sql.types module
____

# Define a new schema using the StructType class
people_schema = ____([
  # Define a StructField for each field: (name, data type, nullable)
  StructField('name', ____, False),
  ____('____', IntegerType(), ____),
  ____
])