Business calculation for hotel booking dataset
Previously, you were introduced to the challenge of predicting booking cancellations. Here, you will work with the actual Hotel Booking dataset, where a model predicts booking cancellations based on the customer's country of origin, time between booking and arrival, required parking spaces, and the chosen hotel.
The reference and analysis sets have already been loaded for you. Here are the first two rows:
   country  lead_time  parking_spaces       hotel  y_pred  y_pred_proba  is_canceled   timestamp
0      FRA        120               0  City Hotel       0      0.239983            0  2016-05-01
1      ITA        120               1  City Hotel       0      0.003965            0  2016-05-01
Your task is to check the model's monetary value and ROC AUC performance.
This exercise is part of the course
Monitoring Machine Learning in Python
Exercise instructions
- Initialize a custom threshold with 0 as the lower value and 150,000 as the upper value.
- Specify the business value and roc_auc metric for monitoring.
- Set TN to 0, FP to -100, FN to -200, and TP to 1500 in business_value_matrix.
- Assign custom threshold to the business value metric.
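To make the value matrix concrete, here is a minimal sketch of how a business value could be computed by hand from predictions. It assumes the matrix is laid out as [[TN, FP], [FN, TP]] and that the total is the sum of each outcome count times its value; the labels and the helper names confusion_counts and business_value are hypothetical and only serve the illustration.

```python
# Values from the instructions, assumed layout: [[TN, FP], [FN, TP]]
business_value_matrix = [[0, -100], [-200, 1500]]


def confusion_counts(y_true, y_pred):
    """Count outcomes in the same [[TN, FP], [FN, TP]] layout."""
    counts = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        # Row = actual class, column = predicted class
        counts[actual][predicted] += 1
    return counts


def business_value(y_true, y_pred, value_matrix):
    """Sum of (outcome count * outcome value) over the four cells."""
    counts = confusion_counts(y_true, y_pred)
    return sum(
        counts[i][j] * value_matrix[i][j]
        for i in range(2)
        for j in range(2)
    )


# Hypothetical labels: 1 = canceled, 0 = not canceled
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]

total = business_value(y_true, y_pred, business_value_matrix)
print(total)
```

With these toy labels there are 2 true negatives, 1 false positive, 1 false negative, and 2 true positives, so the total is 2*0 + 1*(-100) + 1*(-200) + 2*1500 = 2700.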
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Imports needed when running outside the exercise environment
from nannyml import PerformanceCalculator
from nannyml.thresholds import ConstantThreshold

# Custom business value thresholds
ct = ConstantThreshold(____=____, ____=____)
# Initialize the performance calculator
calc = PerformanceCalculator(
    problem_type='classification_binary',
    y_pred_proba='y_pred_proba',
    timestamp_column_name="timestamp",
    y_pred='y_pred',
    y_true='is_canceled',
    chunk_period='m',
    metrics=[____, ____],
    business_value_matrix=[[____, ____], [____, ____]],
    thresholds={____: ____},
)
calc = calc.fit(reference)
calc_res = calc.calculate(analysis)
calc_res.filter(period='analysis').plot().show()
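When reading the resulting plot, it may help to think of the custom threshold as a simple range check applied to each monthly chunk. The sketch below is plain Python, not the library's implementation: it assumes an alert is raised when a chunk's business value falls strictly outside the 0 to 150,000 range, and the monthly values and the helper flag_alerts are hypothetical.

```python
LOWER = 0
UPPER = 150_000


def flag_alerts(chunk_values, lower, upper):
    """Return {chunk: True} for chunks whose value lies outside [lower, upper]."""
    return {
        chunk: value < lower or value > upper
        for chunk, value in chunk_values.items()
    }


# Hypothetical monthly business values, for illustration only
monthly_business_value = {
    "2016-05": 120_000,
    "2016-06": 163_500,
    "2016-07": -4_200,
}

alerts = flag_alerts(monthly_business_value, LOWER, UPPER)
for chunk, is_alert in alerts.items():
    print(chunk, "ALERT" if is_alert else "ok")
```

Under these assumptions only the first month stays inside the band; the second exceeds the upper bound and the third drops below the lower one.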