
Create the pipeline

You're finally ready to create a Pipeline!

Pipeline is a class in the pyspark.ml module that combines all the Estimators and Transformers that you've already created. This lets you reuse the same modeling process over and over again by wrapping it up in one simple object. Neat, right?
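To make the idea concrete, here is a toy, pure-Python sketch of what "wrapping stages in one object" means. It is not PySpark's implementation: the `ToyIndexer`, `ToyIndexerModel`, `ToyPipeline`, and `ToyPipelineModel` classes and the sample rows are hypothetical names and data invented for illustration. The sketch only mirrors the general pattern where Estimators are fitted to produce Transformers, and each stage's output feeds the next.

```python
class ToyIndexer:
    """Estimator-like stage: fit() learns a label -> index mapping."""

    def __init__(self, input_key, output_key):
        self.input_key = input_key
        self.output_key = output_key

    def fit(self, rows):
        # Toy rule: index labels alphabetically (an assumption made for this sketch).
        labels = sorted({row[self.input_key] for row in rows})
        mapping = {label: i for i, label in enumerate(labels)}
        return ToyIndexerModel(self.input_key, self.output_key, mapping)


class ToyIndexerModel:
    """Transformer-like stage: transform() applies the learned mapping."""

    def __init__(self, input_key, output_key, mapping):
        self.input_key = input_key
        self.output_key = output_key
        self.mapping = mapping

    def transform(self, rows):
        return [
            {**row, self.output_key: self.mapping[row[self.input_key]]}
            for row in rows
        ]


class ToyPipeline:
    """Holds an ordered list of stages and fits them one after another."""

    def __init__(self, stages):
        self.stages = list(stages)

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            # Estimators are fitted first; plain transformers are used as-is.
            if hasattr(stage, "fit"):
                stage = stage.fit(rows)
            fitted.append(stage)
            # The output of this stage becomes the input of the next one.
            rows = stage.transform(rows)
        return ToyPipelineModel(fitted)


class ToyPipelineModel:
    """The fitted pipeline: applies every fitted stage in order."""

    def __init__(self, stages):
        self.stages = stages

    def transform(self, rows):
        for stage in self.stages:
            rows = stage.transform(rows)
        return rows


# Made-up sample rows, for illustration only.
rows = [
    {"carrier": "UA", "dest": "SEA"},
    {"carrier": "AA", "dest": "PDX"},
    {"carrier": "UA", "dest": "PDX"},
]

toy_pipe = ToyPipeline(stages=[
    ToyIndexer("dest", "dest_index"),
    ToyIndexer("carrier", "carrier_index"),
])
toy_model = toy_pipe.fit(rows)
indexed = toy_model.transform(rows)
```

The point of the design is that `toy_pipe` can be fitted again on different rows without repeating any of the per-stage calls, which is the kind of reuse the paragraph above describes.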

This exercise is part of the course

Foundations of PySpark


Exercise instructions

  • Import Pipeline from pyspark.ml.
  • Call the Pipeline() constructor with the keyword argument stages to create a Pipeline called flights_pipe.
    • stages should be a list holding all the stages you want your data to go through in the pipeline. Here this is just: [dest_indexer, dest_encoder, carr_indexer, carr_encoder, vec_assembler]

Hands-on interactive exercise

Try this exercise by completing the sample code below.

# Import Pipeline
from ____ import ____

# Make the pipeline
flights_pipe = Pipeline(stages=____)