
Data for survival analysis

In the following exercises, you will work with data about customers of an online shop to practice survival analysis. This time, the event of interest is not churn but the second order: you analyze the time until a customer orders again.

The data is stored in the object dataNextOrder. The variable boughtAgain takes the value 0 for customers with only one order and 1 for customers who have already placed a second order. If a customer has ordered a second time, the variable daysSinceFirstPurch contains the number of days between the first and second order. For customers without a second order, daysSinceFirstPurch contains the number of days since their first (and most recent) order.
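To make the encoding concrete, here is a minimal sketch with three made-up customers. The object toyOrders, the customer column, and all values are hypothetical and only mirror the two variables described above; the real dataNextOrder may contain additional columns. In survival-analysis terms, rows with boughtAgain equal to 0 are right-censored: the second order has not been observed yet, so daysSinceFirstPurch is only a lower bound on the time until that order.

```r
# Hypothetical toy records, not the course data
toyOrders <- data.frame(
  customer            = c("A", "B", "C"),
  boughtAgain         = c(1, 0, 1),      # 1 = second order placed, 0 = not yet
  daysSinceFirstPurch = c(12, 200, 47)   # days to second order, or days elapsed so far
)

# Rows without a second order are right-censored observations
censored <- toyOrders$boughtAgain == 0

# How many customers fall into each group
groupCounts <- table(toyOrders$boughtAgain)
print(groupCounts)

# Median number of days per group (0 = no second order, 1 = second order)
medianDays <- tapply(toyOrders$daysSinceFirstPurch, toyOrders$boughtAgain, median)
print(medianDays)
```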

The ggplot2 package is already loaded into your workspace.

This exercise is part of the course

Machine Learning for Marketing Analytics in R


Exercise instructions

  • Take a look at the data using head().
  • Plot a histogram of the days since the first purchase, separately for customers with and without a second order. (If you're not used to ggplot2 code, don't worry: you only need to use daysSinceFirstPurch as the x variable and boughtAgain as the fill and facet variable.)

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Look at the head of the data
___(dataNextOrder)

# Plot a histogram
ggplot(dataNextOrder) +
  geom_histogram(aes(x = ___,
                     fill = factor(___))) +
  facet_grid( ~ boughtAgain) + # Separate plots for boughtAgain = 1 vs. 0
  theme(legend.position = "none") # Don't show legend
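
For reference, one way the template might be completed, following the hint in the instructions, is sketched below on a small simulated data frame. The object toyNextOrder and its values are made up for illustration and are not the course data. The argument bins = 30 is an addition that only makes ggplot2's default bin count explicit.

```r
library(ggplot2)

# Hypothetical simulated data standing in for dataNextOrder
set.seed(1)
toyNextOrder <- data.frame(
  daysSinceFirstPurch = round(c(rexp(100, rate = 1 / 60),
                                rexp(100, rate = 1 / 150))),
  boughtAgain = rep(c(1, 0), each = 100)
)

# Look at the head of the data
head(toyNextOrder)

# Plot a histogram
p <- ggplot(toyNextOrder) +
  geom_histogram(aes(x = daysSinceFirstPurch,
                     fill = factor(boughtAgain)),
                 bins = 30) +
  facet_grid( ~ boughtAgain) +      # Separate plots for boughtAgain = 1 vs. 0
  theme(legend.position = "none")   # Don't show legend
print(p)
```

With the real data, the same calls would take dataNextOrder in place of toyNextOrder.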