
Model stacking II

OK, what you've done so far in the stacking implementation:

  1. Split train data into two parts
  2. Train multiple models on Part 1
  3. Make predictions on Part 2
  4. Make predictions on the test data
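The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic, and the feature names `x1` and `x2`, the helper `make_frame`, and the model settings are hypothetical stand-ins rather than the course's actual setup. Only `part_2`, `test`, `fare_amount`, `gb_pred` and `rf_pred` are names taken from the exercise.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical; not the course's dataset)
rng = np.random.default_rng(0)
features = ['x1', 'x2']

def make_frame(n):
    """Build a toy frame with two features and a noisy linear target."""
    df = pd.DataFrame(rng.uniform(0, 10, size=(n, 2)), columns=features)
    df['fare_amount'] = 2.5 + 1.5 * df['x1'] + 0.5 * df['x2'] + rng.normal(0, 1, n)
    return df

train = make_frame(400)
test = make_frame(100).drop(columns='fare_amount')

# 1. Split train data into two parts
part_1, part_2 = train_test_split(train, test_size=0.5, random_state=123)
part_2 = part_2.copy()  # copy so new columns can be added safely

# 2. Train multiple models on Part 1
gb = GradientBoostingRegressor(random_state=123)
gb.fit(part_1[features], part_1['fare_amount'])
rf = RandomForestRegressor(n_estimators=50, random_state=123)
rf.fit(part_1[features], part_1['fare_amount'])

# 3. Make predictions on Part 2
part_2['gb_pred'] = gb.predict(part_2[features])
part_2['rf_pred'] = rf.predict(part_2[features])

# 4. Make predictions on the test data
test['gb_pred'] = gb.predict(test[features])
test['rf_pred'] = rf.predict(test[features])
```

Because the first-level models never see Part 2 during training, their Part 2 predictions behave like out-of-sample predictions, which is what makes them usable as features for a second-level model.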

Now your goal is to create a second-level model that uses the predictions from steps 3 and 4 as features. This model is trained on the Part 2 data, and you can then use it to make stacking predictions on the test data.

The part_2 and test DataFrames are already available in your workspace. The Gradient Boosting and Random Forest predictions are stored in these DataFrames under the names "gb_pred" and "rf_pred", respectively.

This exercise is part of the course

Winning a Kaggle Competition in Python


Exercise instructions

  • Train a Linear Regression model on the Part 2 data, using the Gradient Boosting and Random Forest model predictions as features.
  • Make predictions on the test data, using the Gradient Boosting and Random Forest model predictions as features.
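As a rough way to read these two instructions, and assuming the no-intercept setting (`fit_intercept=False`) used in this exercise's sample code, the second-level model is a weighted combination of the two first-level predictions. The symbols $w_{\text{gb}}$ and $w_{\text{rf}}$ are notation introduced here for illustration, not names from the course:

```latex
\hat{y}_{\text{stacking}} = w_{\text{gb}}\,\hat{y}_{\text{gb}} + w_{\text{rf}}\,\hat{y}_{\text{rf}},
\qquad
(w_{\text{gb}}, w_{\text{rf}}) = \arg\min_{w_{\text{gb}},\,w_{\text{rf}}}
\sum_{i \in \text{Part 2}} \left( y_i - w_{\text{gb}}\,\hat{y}_{\text{gb},i} - w_{\text{rf}}\,\hat{y}_{\text{rf},i} \right)^2
```

Here $y_i$ is the `fare_amount` of row $i$ in Part 2, and $\hat{y}_{\text{gb},i}$, $\hat{y}_{\text{rf},i}$ are its `gb_pred` and `rf_pred` values. The fitted weights are what `lr.coef_` reports.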

Hands-on interactive exercise

Try this exercise by completing this sample code.

from sklearn.linear_model import LinearRegression

# Create linear regression model without the intercept
lr = LinearRegression(fit_intercept=False)

# Train 2nd level model on the Part 2 data
lr.____(part_2[['gb_pred', '____']], part_2.fare_amount)

# Make stacking predictions on the test data
test['stacking'] = lr.____(test[['gb_pred', '____']])

# Look at the model coefficients
print(lr.coef_)
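For trying the second-level step outside the course workspace, here is a minimal self-contained sketch. It assumes synthetic first-level predictions: the helper `make_level1_frame` and the noise levels are hypothetical, and, unlike the exercise's test DataFrame, the toy `test` frame here also carries `fare_amount` simply because it is generated the same way as `part_2`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

def make_level1_frame(n):
    """Hypothetical first-level output: the true fare plus independent noise."""
    fare = rng.uniform(5, 50, n)
    return pd.DataFrame({
        'fare_amount': fare,
        'gb_pred': fare + rng.normal(0, 2, n),
        'rf_pred': fare + rng.normal(0, 3, n),
    })

part_2 = make_level1_frame(500)
test = make_level1_frame(200)

# Second-level model: linear regression without the intercept
lr = LinearRegression(fit_intercept=False)

# Train on Part 2, using the first-level predictions as the only features
lr.fit(part_2[['gb_pred', 'rf_pred']], part_2['fare_amount'])

# Stacking predictions on the test data
test['stacking'] = lr.predict(test[['gb_pred', 'rf_pred']])

# One weight per first-level model
print(lr.coef_)
```

With no intercept, each stacking prediction is just the weighted sum of `gb_pred` and `rf_pred`, so the coefficients can be read as how much the second-level model relies on each first-level model.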