Exercise

Predicting the rating of an app

Having explored the Google apps dataset in the previous exercise, let's now build a model that predicts the rating of an app given a subset of its features.

To do this, you'll use scikit-learn's DecisionTreeRegressor. As decision trees are the building blocks of many ensemble models, refreshing your memory of how they work will serve you well throughout this course.

We'll use the mean absolute error (MAE) as the evaluation metric. It is easy to interpret: it is the average absolute difference between actual and predicted ratings, expressed in the same units as the ratings themselves.
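
To make the metric concrete, here is a minimal sketch that computes the MAE by hand and with scikit-learn's mean_absolute_error(). The ratings in y_true and y_pred are made-up values for illustration only; they are not taken from the apps dataset.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error

# Made-up ratings, purely illustrative (not from the apps dataset)
y_true = np.array([4.0, 3.5, 5.0, 2.0])
y_pred = np.array([4.5, 3.0, 4.0, 2.5])

# MAE by hand: mean of the absolute differences
mae_manual = np.mean(np.abs(y_true - y_pred))

# The same quantity via scikit-learn
mae_sklearn = mean_absolute_error(y_true, y_pred)

print(mae_manual, mae_sklearn)  # both are 0.625
```

An MAE of 0.625 here means the predictions are off by 0.625 rating points on average.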

All required modules have been pre-imported for you. The features and target are available in the variables X and y, respectively.

Instructions

  • Use train_test_split() to split X and y into train and test sets. Use 20%, or 0.2, as the test size.
  • Instantiate a DecisionTreeRegressor(), reg_dt, with the following hyperparameters: min_samples_leaf = 3 and min_samples_split = 9.
  • Fit the regressor to the training set using .fit().
  • Predict the target values (ratings) of the test set using .predict().
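
The steps above could be sketched as follows. This is one possible solution under stated assumptions, not the exercise's official answer: because the real X and y are only available inside the exercise environment, the sketch builds a synthetic stand-in, and the random_state values and the variable names other than reg_dt are choices made here for reproducibility. The final MAE computation is not one of the listed instructions; it is included only to show how the metric described earlier would be applied.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in for the pre-loaded X and y (assumption: the real
# data is a numeric feature matrix and a vector of ratings)
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = np.clip(4.0 + 0.3 * X[:, 0] + rng.normal(scale=0.2, size=200), 1.0, 5.0)

# Split into train and test sets, holding out 20% for testing
# (random_state is an assumption, added so the split is repeatable)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Instantiate the regressor with the hyperparameters from the instructions
reg_dt = DecisionTreeRegressor(
    min_samples_leaf=3, min_samples_split=9, random_state=500
)

# Fit on the training set, then predict ratings for the test set
reg_dt.fit(X_train, y_train)
y_pred = reg_dt.predict(X_test)

# Evaluate with the mean absolute error
mae = mean_absolute_error(y_test, y_pred)
print(f"Test MAE: {mae:.3f}")
```

Setting min_samples_leaf and min_samples_split above their defaults of 1 and 2 limits how finely the tree can partition the training data, which tends to reduce overfitting.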