Cross-validation for R-squared

Cross-validation is a vital approach to evaluating a model. It makes full use of the available data: across the folds, every observation is used for training and, exactly once, for testing.
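To make the idea concrete, here is a minimal sketch of how a k-fold split covers the data. The twelve-row array `X_demo` and the name `kf_demo` are illustrative assumptions, not part of the exercise; the point is only that each row lands in a test fold exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold

# Twelve toy samples identified by their row index (illustrative data only)
X_demo = np.arange(12).reshape(-1, 1)

# Same settings the exercise asks for: six splits, shuffled, seed of 5
kf_demo = KFold(n_splits=6, shuffle=True, random_state=5)

# Collect the test indices of every fold
test_indices = []
for fold, (train_idx, test_idx) in enumerate(kf_demo.split(X_demo), start=1):
    print(f"Fold {fold}: train on {len(train_idx)} rows, test on rows {test_idx.tolist()}")
    test_indices.extend(test_idx.tolist())

# Across the six folds, each row is used for testing exactly once
print(sorted(test_indices) == list(range(12)))
```

With 12 rows and 6 folds, each fold trains on 10 rows and tests on the remaining 2.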

In this exercise, you will build a linear regression model, then use 6-fold cross-validation to assess how well it predicts sales from social media advertising expenditure, as measured by R-squared. You will display the individual score for each of the six folds.

The sales_df dataset has been split into y for the target variable, and X for the features, and preloaded for you. LinearRegression has been imported from sklearn.linear_model.

This exercise is part of the course

Supervised Learning with scikit-learn

Exercise instructions

  • Import KFold and cross_val_score.
  • Create kf by calling KFold(), setting the number of splits to six, shuffle to True, and setting a seed of 5.
  • Perform cross-validation using reg on X and y, passing kf to cv.
  • Print the cv_scores.

Hands-on interactive exercise

Try this exercise by completing this sample code.

# Import the necessary modules
from ____.____ import ____, ____

# Create a KFold object
kf = ____(n_splits=____, shuffle=____, random_state=____)

reg = LinearRegression()

# Compute 6-fold cross-validation scores
cv_scores = ____(____, ____, ____, cv=____)

# Print scores
print(____)
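
For reference, one way the completed steps might look is sketched below. Because the preloaded sales_df data is not available here, the sketch substitutes synthetic data: the random generator, the two feature columns, and the coefficients used to build y are all assumptions made purely for illustration, so the printed scores will not match those of the real exercise.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the preloaded X and y (NOT the real sales_df data)
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(120, 2))  # hypothetical advertising features
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + rng.normal(0, 5, size=120)  # hypothetical sales

# Six shuffled folds with a fixed seed, as the instructions describe
kf = KFold(n_splits=6, shuffle=True, random_state=5)

reg = LinearRegression()

# With no scoring argument, cross_val_score uses the estimator's own score
# method, which for LinearRegression is R-squared
cv_scores = cross_val_score(reg, X, y, cv=kf)

# One R-squared value per fold
print(cv_scores)
```

The result is a NumPy array with one entry per fold, so summaries such as `cv_scores.mean()` or `cv_scores.std()` can be computed directly from it.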