Performance estimation for tip prediction
In the previous exercises, you prepared a reference and analysis set for the NYC Green Taxi dataset. In this one, you will use that data to estimate the model's performance in production.
First, you will initialize the DLE (Direct Loss Estimation) algorithm with the provided parameters, fit it on the reference set, estimate performance on the analysis set, and then plot the results.
The reference and analysis sets are already loaded and saved in the reference and analysis variables.
Additionally, nannyml is already imported.
This exercise is part of the course Monitoring Machine Learning in Python
Exercise instructions
- Initialize the DLE algorithm with a daily chunk period, tip_amount as y_true, and the MSE metric.
- Fit the reference set to the DLE estimator, estimate performance for the analysis set, and store the output in the results variable.
- Visualize the results using the plot() and show() methods.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
estimator = nannyml.DLE(y_pred='y_pred',
                        timestamp_column_name='lpep_pickup_datetime',
                        feature_column_names=features,
                        chunk_period='d',
                        y_true='tip_amount',
                        metrics=['mse'])
# Fit the reference data to the DLE algorithm
estimator.____(____)
# Estimate the performance on the analysis data
results = estimator.____(____)
# Plot and show the results
____.____().____()
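
To make the daily chunk period and the MSE metric more concrete, here is a minimal sketch in plain Python of what "MSE per daily chunk" means. It does not use nannyml and is not how DLE works internally: DLE estimates this quantity for the analysis set when the true tips are not yet available, whereas the sketch computes the realized value from known targets. The records and the daily_mse helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (pickup timestamp, actual tip, predicted tip).
# These values are made up and do not come from the NYC Green Taxi dataset.
records = [
    (datetime(2016, 12, 1, 8, 30), 2.0, 1.5),
    (datetime(2016, 12, 1, 19, 5), 0.0, 1.0),
    (datetime(2016, 12, 2, 7, 45), 3.0, 3.5),
    (datetime(2016, 12, 2, 22, 10), 5.0, 3.0),
]


def daily_mse(rows):
    """Group rows by calendar day and return {date: mean squared error}."""
    squared_errors = defaultdict(list)
    for timestamp, y_true, y_pred in rows:
        # Each calendar day acts as one chunk.
        squared_errors[timestamp.date()].append((y_true - y_pred) ** 2)
    return {
        day: sum(errors) / len(errors)
        for day, errors in sorted(squared_errors.items())
    }


mse_by_day = daily_mse(records)
for day, mse in mse_by_day.items():
    print(day, mse)
```

Each printed line corresponds to one chunk, which is the same granularity at which the results plot in this exercise reports the estimated MSE.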