
Flight duration model: Just distance

In this exercise you'll build a regression model to predict flight duration (the duration column).

For the moment you'll keep the model simple, including only the distance of the flight (the km column) as a predictor.
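As a rough illustration of what a single-predictor model amounts to, the sketch below fits duration = intercept + slope * km by ordinary least squares in plain Python. The function name `fit_simple_linear` and the toy numbers are hypothetical and are not taken from the flights data; the exercise itself uses PySpark's LinearRegression instead.

```python
def fit_simple_linear(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical distances (km) and durations (assumed to be minutes), for illustration only
km = [500.0, 1000.0, 1500.0, 2000.0]
duration = [80.0, 120.0, 160.0, 200.0]

intercept, slope = fit_simple_linear(km, duration)

# Predicted duration for a hypothetical 1200 km flight
predicted = intercept + slope * 1200.0
```

With only one predictor, the slope reads as the extra duration per additional kilometre, and the intercept as the fixed part of the duration that does not depend on distance.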

The data are in flights. The first few records are displayed in the terminal. These data have also been split into training and testing sets and are available as flights_train and flights_test.

This exercise is part of the course Machine Learning with PySpark.

Exercise instructions

  • Create a linear regression object. Specify the name of the label column. Fit it to the training data.
  • Make predictions on the testing data.
  • Create a regression evaluator object and use it to evaluate RMSE on the testing data.
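For reference, RMSE (root mean squared error) is the square root of the average squared difference between the observed and predicted values. The plain-Python sketch below shows the calculation; the helper name `rmse` is hypothetical, and in the exercise the value comes from the regression evaluator rather than from hand-written code.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical observed and predicted durations, for illustration only
observed = [100.0, 150.0, 200.0]
guesses = [103.0, 146.0, 200.0]
error = rmse(observed, guesses)
```

RMSE is expressed in the same units as the label, so it can be read as a typical size of the prediction error on the duration column.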

Hands-on interactive exercise

Finish this exercise by completing the sample code below.

from pyspark.ml.regression import LinearRegression
from pyspark.ml.evaluation import RegressionEvaluator

# Create a regression object and train on training data
regression = ____(____).____(____)

# Create predictions for the testing data and take a look at the predictions
predictions = ____.____(____)
predictions.select('duration', 'prediction').show(5, False)

# Calculate the RMSE
____(____).____(predictions)