Let's use the decision tree that you trained in the first exercise. The tree predicts whether or not a loan applicant will default on their loan.
Assume we have a loan applicant who:
- is applying for a 20-month loan
- is requesting a loan amount that is 2% of their income
- is 25 years old
After following the correct path down the tree for this individual's data, you will end up in a "Yes" or "No" bucket (in tree terminology, a "leaf"), which represents the predicted class. Ending up in a "Yes" leaf means that the model predicts that this individual will default on their loan, whereas a "No" leaf means that the model predicts they will not default.
Starting with the top node of the tree, you evaluate a yes/no question about a particular attribute of your data point (e.g. is months_loan_duration < 44?). If the answer is yes, you go left at the split; if the answer is no, you go right. At the next node you repeat the process, and you continue until you end up in a leaf node, at which point you have a predicted class for your data point.
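The traversal described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `toy_tree` structure, the `predict` helper, the `age` threshold, and the example applicant are all hypothetical, and only the `months_loan_duration < 44` question comes from the text. It is not the tree you trained, so its output is not the answer to the exercise.

```python
# A node is either a leaf (a class label string) or a dict describing a split.
# HYPOTHETICAL tree: the structure and the age threshold are invented for
# illustration and do not match the tree trained in the first exercise.
toy_tree = {
    "feature": "months_loan_duration",
    "threshold": 44,
    "left": {                      # reached when months_loan_duration < 44
        "feature": "age",
        "threshold": 30,           # invented threshold
        "left": "Yes",             # leaf: predicted to default
        "right": "No",             # leaf: predicted not to default
    },
    "right": "Yes",                # reached when months_loan_duration >= 44
}


def predict(node, applicant):
    """Walk from the top node to a leaf and return the leaf's class label."""
    while isinstance(node, dict):
        # Ask the node's question: is the attribute below the threshold?
        if applicant[node["feature"]] < node["threshold"]:
            node = node["left"]    # answer is yes: go left
        else:
            node = node["right"]   # answer is no: go right
    return node


# Hypothetical applicant, deliberately different from the one in the exercise.
example_applicant = {"months_loan_duration": 36, "age": 41}
print(predict(toy_tree, example_applicant))
```

For your own answer, apply the same left/right rule by hand to the plot of the tree you actually trained, using the applicant's three attributes listed above.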
According to the model, will this person default on their loan?