1. The log-rank test
Having learned how to compare survival curves using the Kaplan-Meier estimator, we will now study how to test formally whether the underlying survival distributions are identical.
2. Hypothesis testing
Hypothesis testing is standard practice in statistical inference. We assess a hypothesis by asking how likely results like ours would be if they arose by chance alone.
For example, say our null hypothesis is that California's and Nevada's residents have the same income. The alternative hypothesis is that they don't have the same income. We use these states' income data to calculate a p-value. The p-value is the probability of observing income data at least as extreme as ours if the null hypothesis were true.
A very small p-value allows us to reject the null hypothesis.
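To make the idea of a p-value concrete, here is a minimal sketch using a permutation test, which is one simple way to estimate a p-value and not necessarily how it would be done for real income data. The income figures are made up for illustration.

```python
import random
import statistics

# Hypothetical annual incomes in thousands of dollars (not real data).
california = [62, 75, 58, 91, 66, 80, 73, 69]
nevada = [55, 70, 61, 64, 59, 77, 68, 60]

# Test statistic: absolute difference between the two group means.
observed_diff = abs(statistics.mean(california) - statistics.mean(nevada))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
pooled = california + nevada
n_california = len(california)
n_permutations = 10_000
n_at_least_as_extreme = 0

for _ in range(n_permutations):
    # Under the null hypothesis the state labels are interchangeable,
    # so we reshuffle them and recompute the difference.
    rng.shuffle(pooled)
    diff = abs(
        statistics.mean(pooled[:n_california])
        - statistics.mean(pooled[n_california:])
    )
    if diff >= observed_diff:
        n_at_least_as_extreme += 1

# Share of reshuffled datasets at least as extreme as the observed one.
p_value = n_at_least_as_extreme / n_permutations
print(f"observed difference: {observed_diff:.2f}, p-value: {p_value:.3f}")
```

The smaller this share, the harder it is to explain the observed difference by chance alone.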
3. Log-rank hypothesis testing
The log-rank test tests the null hypothesis that there is no difference in survival between 2 or more independent groups.
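The curves being compared are the ones estimated separately for each group with the Kaplan-Meier estimator. The test compares the groups over the whole follow-up period rather than at a single time point: at each observed event time, the number of events in each group is compared with the number expected if all groups shared one survival curve.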
For 2 groups A and B, the null hypothesis is that they have identical survival curves. The alternative hypothesis is that they have different survival curves.
The p-value from the log-rank test is the probability of observing a difference between the groups at least as large as the one in our data if the null hypothesis were true. Values such as 0.05 and 0.01 are commonly used as thresholds for rejecting the null hypothesis.
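To see what the test computes for 2 groups, here is a rough from-scratch sketch of the standard two-group log-rank statistic using only the standard library. The function name logrank_two_groups and the data are hypothetical, and this is an illustration of the idea, not the lifelines implementation.

```python
import math


def logrank_two_groups(durations_a, events_a, durations_b, events_b):
    """Return (chi-square statistic, p-value) for a two-group log-rank test."""
    # Distinct times at which at least one event was observed in either group.
    event_times = sorted(
        {t for t, e in zip(durations_a, events_a) if e}
        | {t for t, e in zip(durations_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t.
        n_a = sum(1 for d in durations_a if d >= t)
        n_b = sum(1 for d in durations_b if d >= t)
        n = n_a + n_b
        # Events observed exactly at time t.
        d_a = sum(1 for d, e in zip(durations_a, events_a) if d == t and e)
        d_b = sum(1 for d, e in zip(durations_b, events_b) if d == t and e)
        d = d_a + d_b
        # Observed events in group A minus the number expected under the null.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    statistic = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical data: 1 = event observed, 0 = censored.
durations_a = [6, 7, 10, 15, 19, 25]
events_a = [1, 0, 1, 1, 0, 1]
durations_b = [5, 8, 12, 13, 20, 30]
events_b = [1, 1, 1, 0, 1, 1]

statistic, p_value = logrank_two_groups(durations_a, events_a, durations_b, events_b)
print(f"statistic: {statistic:.3f}, p-value: {p_value:.3f}")
```

Two identical groups give a statistic of 0 and a p-value of 1, while groups whose events barely overlap in time give a large statistic and a small p-value.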
4. Running the log-rank test
To run a log-rank test, we import the logrank_test() function from the statistics module of the lifelines package.
This function takes 4 main parameters. durations_A is a list of event durations for the first population. durations_B is a list of event durations for the second population. event_observed_A is a list of censorship flags for the first population, and event_observed_B is a list of censorship flags for the second population, where 1 means the event was observed and 0 means it was not. Only the two durations arguments are required; if the flags are omitted, lifelines assumes every event was observed.
The function returns a StatisticalResult object. One useful method is print_summary(), which prints all the test results. We can also use the p_value and test_statistic attributes to access these properties directly.
5. Log-rank test example
Let's see an example.
We're testing whether a speech program for babies changes when they speak their first words.
DataFrame t contains data from babies in the speech program, and DataFrame c contains data from the control group. The duration columns hold the time between each baby's birth and their first word. The observed columns record whether the first word has been observed. How would we run a log-rank test? Feel free to pause the video to think.
To run the log-rank test, we call logrank_test() with the treatment and control group data and assign the returned StatisticalResult object to a variable. Calling the print_summary() method, we see that the p-value is 0.77, which is larger than 0.05. Therefore we fail to reject the null hypothesis.
6. Keep in mind...
The log-rank test is non-parametric and makes no assumption about the shape of the survival distributions.
However, its implementation in the lifelines package only applies to right-censored time-to-event data. This means that for some subjects we haven't observed the event by the time the measurement period ends. For example, subject 3 in this chart is right-censored because its event occurs after the end of our observation period. As with all survival analysis tools, the censoring should be non-informative, meaning that it is not related to survival durations and outcomes.
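To show how right-censoring turns into the durations and flags the test expects, here is a small sketch. The study end date and the subject records are hypothetical.

```python
from datetime import date

STUDY_END = date(2021, 12, 31)  # hypothetical end of the observation period

# (subject id, start date, event date or None if no event was recorded)
records = [
    (1, date(2021, 1, 10), date(2021, 5, 2)),
    (2, date(2021, 2, 1), date(2021, 11, 20)),
    (3, date(2021, 3, 15), date(2022, 2, 1)),  # event falls after the study ends
    (4, date(2021, 6, 1), None),               # no event recorded at all
]

durations, observed = [], []
for subject, start, event in records:
    if event is not None and event <= STUDY_END:
        # Event seen during the study: the duration runs up to the event.
        durations.append((event - start).days)
        observed.append(1)
    else:
        # Right-censored: we only know the subject lasted until the study ended.
        durations.append((STUDY_END - start).days)
        observed.append(0)

print(durations)
print(observed)
```

Subjects 3 and 4 both end up with a flag of 0, and their durations stop at the end of the observation period.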
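Lastly, to run a log-rank test for more than 2 groups, use the pairwise_logrank_test() function or the multivariate_logrank_test() function. pairwise_logrank_test() compares all possible pairs of survival curves, while multivariate_logrank_test() tests the single null hypothesis that all groups share the same survival curve.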
7. Let's practice!
We just learned what the log-rank test is and how to use it. Now let's practice!