
Predicting with the Cox PH model

1. Predicting with the Cox PH model

Using the Cox PH model, we can analyze existing data and predict survival times for new data.

2. Predict median survival times

Once we have fitted the model on the training data, the predict_median method predicts the median lifetimes for subjects in the new data. If the survival curve of a subject does not cross 0.5, then the result is infinity. The predict_median method takes two parameters: X, which is the new DataFrame, and conditional_after, which is an array or list of values that indicate how long each subject has already lived.

3. Predict median survival times

The predict_median method returns a pandas Series of predictions for each subject in the DataFrame X. When we specify the conditional_after parameter, all predictions made are conditional on subjects already having survived for the specified time.

4. Predict the survival function

The predict_survival_function method predicts the survival function for subjects in the new data, given their covariates. It takes the same parameters as the predict_median method: X, which is the new DataFrame, and conditional_after, which is an array or list of values that indicate how long each subject has already lived.

5. Predict the survival function

The predict_survival_function method returns a pandas DataFrame of predictions for each subject. Each row of the DataFrame is one time point on the survival function, and each column is a subject. When we specify the conditional_after parameter, the timeline is adjusted to start at the last observed time of each subject, so the new timeline measures the remaining duration of the subject. You might wonder why such a prediction table is useful. There are many real-life applications. For example, in a machine failure analysis, we might want to predict when each part should be replaced or inspected to proactively prevent failure. Another example is forecasting. Say we are predicting employee churn: we could forecast the business's employee count based on expected survival rates.

6. Key steps

There are some key steps to getting a prediction model right. First, we must preprocess the data and check each column for categorical values. If a covariate is categorical, we should one-hot encode it into columns of 0s and 1s. Second, we split the data into a train dataset and a test dataset. A common split ratio is 80/20. Make sure to check that the proportions of censored data are similar between train and test. When the data is ready, we fit a Cox PH model to the training data.

7. Let's practice!

Let's practice making predictions with the Cox PH model!