Random Forest: prediction

Now use your random forest model to make some predictions. The syntax is the same as for the gradient boosted trees model.

This exercise is part of the course

Introduction to Spark with sparklyr in R

Exercise instructions

A Spark connection has been created for you as spark_conn. Tibbles attached to the training and testing datasets stored in Spark have been pre-defined as track_data_to_model_tbl and track_data_to_predict_tbl respectively. The random forest model has been pre-defined as random_forest_model.

  • Define a variable predicted that contains the model's predictions for our testing data.
    • Call ml_predict() with the model and the testing data as arguments. This function will generate predictions for the testing dataset and add these as a new column named prediction.
  • Define the responses variable to prepare the data for comparing predicted responses with actual responses:
    • Select the response column year.
    • Collect the results.
    • Use mutate() to add in the predictions made in predicted.

Hands-on interactive exercise

Try this exercise by completing the sample code.

# Training, testing sets & model are pre-defined
track_data_to_model_tbl
track_data_to_predict_tbl
random_forest_model

# Predict the responses for the testing data
predicted <- ml_predict(
  ___,
  ___
) %>%
  pull(prediction)

# Create a response vs. actual dataset
responses <- ___
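
One way the blanks above might be filled in is sketched below. It assumes the exercise environment, where random_forest_model and track_data_to_predict_tbl already exist on the spark_conn connection. The column name predicted_year is chosen here for illustration only; the instructions do not specify one.

```r
library(sparklyr)
library(dplyr)

# Assumes random_forest_model and track_data_to_predict_tbl are already
# defined, as stated in the exercise instructions.

# ml_predict() returns the testing data with an added prediction column;
# pull() brings that column back to R as a plain vector.
predicted <- ml_predict(
  random_forest_model,
  track_data_to_predict_tbl
) %>%
  pull(prediction)

# Keep only the actual response, collect it into R, then attach the
# predictions. predicted_year is an illustrative column name.
responses <- track_data_to_predict_tbl %>%
  select(year) %>%
  collect() %>%
  mutate(predicted_year = predicted)
```

This sketch pairs predictions with actual values by position, so it relies on both queries returning rows in the same order.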