Predictions
Now you are going to predict the survival curve for a new customer from the Cox Proportional Hazards model you estimated before. The model is still available in the object fitCPH.
The new customer is a woman and used a voucher in her first order (voucher = 1). The order was placed 21 days ago and had a shopping cart value of 99.90 dollars. She didn't return the order (returned = 0).
Remember: voucher and returned can have the values 0 or 1.
This exercise is part of the course
Machine Learning for Marketing Analytics in R
Exercise instructions
- Create a one-row dataframe called newCustomer with the new customer's characteristics listed in the assignment text above.
- Predict the expected median time until the second order for this customer using print() and plot the predicted survival curve.
- You are informed that due to database problems the gender was incorrectly coded: The new customer is actually a man. The dataframe newCustomer is copied into a dataframe called newCustomer2. Now go ahead and change the respective variable to male.
- Recompute the predicted median with the corrected data newCustomer2. What changed?
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Create data with new customer
___ <- data.frame(daysSinceFirstPurch = __, shoppingCartValue = ___, gender = "female", voucher = _, returned = _)
# Make predictions
pred <- survfit(fitCPH, newdata = ___)
print(___)
___(pred)
# Dataset is copied. Now correct the customer's gender there
newCustomer2 <- newCustomer
___$gender <- ___
# Redo prediction
pred2 <- ___(fitCPH, newdata = ___)
print(___)
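
For orientation, the steps in the template can be sketched end to end as follows. Since fitCPH and the course data are not reproduced on this page, the sketch fits a stand-in model with coxph() from the survival package on simulated data. The toy data, the event column boughtAgain, and the model formula are all assumptions made for illustration and may differ from the course's actual model.

```r
library(survival)

# ASSUMPTION: simulated stand-in data, not the course dataset
set.seed(42)
n <- 300
toy <- data.frame(
  shoppingCartValue = round(runif(n, 10, 200), 2),
  voucher  = rbinom(n, 1, 0.4),
  returned = rbinom(n, 1, 0.2),
  gender   = factor(sample(c("female", "male"), n, replace = TRUE))
)
# Simulated days until the second order, with made-up covariate effects
rate <- 0.03 * exp(-0.3 * toy$voucher + 0.2 * (toy$gender == "male"))
toy$daysSinceFirstPurch <- ceiling(rexp(n, rate))
# Hypothetical event indicator (1 = second order observed, 0 = censored)
toy$boughtAgain <- rbinom(n, 1, 0.8)

# ASSUMPTION: stand-in for fitCPH; the formula is illustrative only
fitCPH <- coxph(Surv(daysSinceFirstPurch, boughtAgain) ~
                  shoppingCartValue + voucher + returned + gender,
                data = toy)

# One-row data frame describing the new customer from the assignment text
newCustomer <- data.frame(daysSinceFirstPurch = 21, shoppingCartValue = 99.90,
                          gender = "female", voucher = 1, returned = 0,
                          stringsAsFactors = FALSE)

# Predicted survival curve; print() reports the median, plot() draws the curve
pred <- survfit(fitCPH, newdata = newCustomer)
print(pred)
plot(pred)

# Copy the data and correct the gender
newCustomer2 <- newCustomer
newCustomer2$gender <- "male"

# Redo the prediction with the corrected data
pred2 <- survfit(fitCPH, newdata = newCustomer2)
print(pred2)
```

The numbers printed here come from the simulated data and say nothing about the course model. What carries over is the mechanism: in a Cox model, changing gender rescales the predicted hazard by the estimated hazard ratio, so the survival curve shifts and the median reported by print(pred2) can differ from the one reported by print(pred).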