Computing the Gini index
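The decision tree algorithm aims to make its terminal nodes as pure as possible, meaning that each node contains samples from mostly one class. The Gini index is one measure of node impurity used to guide this. It is calculated from the proportion of samples in each class: for two classes with proportions p and 1 - p, it equals 2 * p * (1 - p). A value of 0 indicates a perfectly pure node, and 0.5 is the maximum for two classes.
Given the number of employees who stayed and the number who left, calculate the Gini index for a node.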
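To see how class proportions drive the index, the following minimal sketch uses the general definition: 1 minus the sum of squared class proportions. The helper name `gini_impurity` and the example counts are illustrative assumptions, not part of the course code.

```python
def gini_impurity(counts):
    """Gini index of a node from per-class sample counts (hypothetical helper)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("node has no samples")
    # 1 minus the sum of squared class proportions
    return 1 - sum((c / total) ** 2 for c in counts)

pure = gini_impurity([10, 0])   # all samples in one class
mixed = gini_impurity([5, 5])   # evenly split between two classes
print(pure, mixed)              # prints: 0.0 0.5
```

For two classes this general form reduces to 2 * p * (1 - p), which is the shape used in the exercise.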
This exercise is part of the course
HR Analytics: Predicting Employee Churn in Python
Exercise instructions
- Calculate the total number of employees in the node.
- Compute the Gini index based on the proportion of employees in each group.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Number of employees who stayed / left
stayed = 37
left = 1138
# Sum of stayed and left
total = ____ + ____
# Gini index
gini = ____*(____/total)*(____/total)
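One possible way to fill in the blanks, assuming the two-class form of the Gini index, 2 * p * (1 - p), which matches the shape of the template above:

```python
# Number of employees who stayed / left
stayed = 37
left = 1138

# Total number of employees in the node
total = stayed + left

# Two-class Gini index: 2 * (proportion stayed) * (proportion left)
gini = 2 * (stayed / total) * (left / total)
print(round(gini, 4))
```

The result is about 0.061, close to 0, which reflects that the node is nearly pure: almost all of its employees left.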