Model stacking I

Now it's time for stacking. To implement the stacking approach, you will follow the six steps discussed in the previous video:

1. Split the train data into two parts
2. Train multiple models on Part 1
3. Make predictions on Part 2
4. Make predictions on the test data
5. Train a new model on Part 2 using the predictions as features
6. Make predictions on the test data using the 2nd level model
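
The six steps can be sketched end to end as follows. This is only an illustration under stated assumptions: the data is synthetic, the column names f1 and f2 are hypothetical stand-ins for the real features list, and LinearRegression is one possible choice of 2nd level model, not necessarily the one the course uses.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): in the exercise, train and test
# already exist in the workspace.
rng = np.random.default_rng(0)
features = ["f1", "f2"]  # hypothetical column names


def make_frame(n):
    df = pd.DataFrame(rng.normal(size=(n, 2)), columns=features)
    df["fare_amount"] = 3 * df["f1"] - 2 * df["f2"] + rng.normal(scale=0.1, size=n)
    return df


train = make_frame(200)
test = make_frame(50).drop(columns="fare_amount")  # test data has no target

# Step 1: split the train data into two equal parts
part_1, part_2 = train_test_split(train, test_size=0.5, random_state=123)
part_2 = part_2.copy()  # copy so new columns can be added safely

# Step 2: train multiple models on Part 1
gb = GradientBoostingRegressor(random_state=123)
gb.fit(part_1[features], part_1["fare_amount"])
rf = RandomForestRegressor(n_estimators=50, random_state=123)
rf.fit(part_1[features], part_1["fare_amount"])

# Step 3: make predictions on Part 2
part_2["gb_pred"] = gb.predict(part_2[features])
part_2["rf_pred"] = rf.predict(part_2[features])

# Step 4: make predictions on the test data
test["gb_pred"] = gb.predict(test[features])
test["rf_pred"] = rf.predict(test[features])

# Step 5: train a 2nd level model on Part 2, using the predictions as features
stack_cols = ["gb_pred", "rf_pred"]
lr = LinearRegression()  # assumed 2nd level model
lr.fit(part_2[stack_cols], part_2["fare_amount"])

# Step 6: make predictions on the test data using the 2nd level model
test["stacking"] = lr.predict(test[stack_cols])
```

The 2nd level model never sees Part 1, so it learns how to combine the base models from predictions they made on data they were not trained on.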

The train and test DataFrames are already available in your workspace, along with features, a list of the columns to use for training on the Part 1 data. The target variable is "fare_amount".

Split the train DataFrame into two equal parts: part_1 and part_2. Use the train_test_split() function with test_size equal to 0.5.
Train Gradient Boosting and Random Forest models on the part_1 data.
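
A minimal sketch of these two instructions, assuming regression models from scikit-learn. The small train DataFrame and the features list built here are hypothetical stand-ins for the workspace objects, and random_state=123 is an arbitrary choice made only for reproducibility.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the workspace objects
rng = np.random.default_rng(1)
features = ["f1", "f2"]  # hypothetical column names
train = pd.DataFrame(rng.normal(size=(100, 2)), columns=features)
train["fare_amount"] = train["f1"] + train["f2"]

# Split train into two equal parts: test_size=0.5 puts half the rows in each
part_1, part_2 = train_test_split(train, test_size=0.5, random_state=123)

# Train a Gradient Boosting model on the Part 1 data
gb = GradientBoostingRegressor(random_state=123)
gb.fit(part_1[features], part_1["fare_amount"])

# Train a Random Forest model on the Part 1 data
rf = RandomForestRegressor(n_estimators=50, random_state=123)
rf.fit(part_1[features], part_1["fare_amount"])
```

Only part_1 is used for fitting here; part_2 is held back so that the base models' predictions on it can later serve as features.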
