Collective Inferencing
Collective inferencing is a procedure to simultaneously label nodes in interconnected data to reduce classification error.
In this exercise you will perform collective inferencing and see the effect it has on the churn prediction using the AUC performance measure. AUC, or area under the ROC curve, is commonly used to assess the performance of classification techniques.
- AUC = probability that a randomly chosen churner is ranked higher by the model than a randomly chosen non-churner
- AUC = number between 0 and 1, where 0.5 corresponds to random ranking; a useful model scores between 0.5 and 1, and a higher number means a better model
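The ranking interpretation of AUC can be checked by hand on a small example. The sketch below uses base R only and made-up data (the vectors churn and score are hypothetical, not the course data set); it compares every churner with every non-churner and counts how often the churner gets the higher score, with ties counted as half.

```r
# Hypothetical toy data: 1 = churner, 0 = non-churner
churn <- c(1, 0, 1, 0, 0)
score <- c(0.9, 0.3, 0.4, 0.5, 0.1)

# Split the predicted scores by actual class
pos <- score[churn == 1]
neg <- score[churn == 0]

# Compare every churner with every non-churner
wins <- outer(pos, neg, ">")   # churner ranked strictly higher
ties <- outer(pos, neg, "==")  # equal scores count as half a win

# Share of pairs in which the churner is ranked higher
pairwise_auc <- mean(wins + 0.5 * ties)
print(pairwise_auc)
```

Here 5 of the 6 churner/non-churner pairs are ordered correctly, so the value is 5/6.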
Does collective inferencing increase the AUC value?
This exercise is part of the course
Predictive Analytics using Networked Data in R
Exercise instructions
- Compute the AUC of the relational neighbor classifier by calling the auc function in the pROC package, using the actual churn labels customers$churn and churnProb as the predicted value.
- Write a for loop where you apply the probabilistic relational neighbor classifier ten times, and assign the value again to the churnProb vector in each iteration.
- Compute the AUC again using the updated churnProb vector.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load the pROC package and data
library(pROC)
load("Nex132.RData")
# Compute the AUC
___(customers$churn, as.vector(churnProb))
# Write a for loop to update the probabilities
___(i in 1:10){
___ <- as.vector((AdjacencyMatrix %*% churnProb) / neighbors)
}
# Compute the AUC again
___(customers$churn, as.vector(___))
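To see what the repeated update does, here is a minimal sketch on a made-up four-node network, separate from the exercise data. The names A, neighbors_toy, prob and prob_step1 are hypothetical; only the update rule (adjacency matrix times probabilities, divided by the number of neighbors) is taken from the sample code above. Each iteration replaces a node's probability with the average probability of its neighbors.

```r
# Hypothetical undirected network with edges 1-2, 1-3, 2-3, 3-4
A <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)

# Number of neighbors of each node (2, 2, 3, 1)
neighbors_toy <- rowSums(A)

# Assumed starting point: only node 1 is a known churner
prob <- c(1, 0, 0, 0)

# A single update: each node takes the mean probability of its neighbors
prob_step1 <- as.vector((A %*% prob) / neighbors_toy)
print(prob_step1)

# Repeat the same update ten times, overwriting prob in each iteration
for (i in 1:10) {
  prob <- as.vector((A %*% prob) / neighbors_toy)
}
print(round(prob, 3))
```

Because every update is an average of values between 0 and 1, the probabilities stay in that range however many iterations are run.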