Collective Inferencing
Collective inferencing is a procedure to simultaneously label nodes in interconnected data to reduce classification error.
In this exercise you will perform collective inferencing and see the effect it has on the churn prediction using the AUC performance measure. AUC, or area under the ROC curve, is commonly used to assess the performance of classification techniques.
- AUC = probability that a randomly chosen churner is ranked higher by the model than a randomly chosen non-churner
- AUC = number between 0 and 1, where 0.5 corresponds to random ranking and a higher number means a better model
Does collective inferencing increase the AUC value?
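The pairwise interpretation of AUC given above can be checked directly on a small example. The following is a minimal sketch in base R; the labels and scores are made-up toy values, not the course data, and the variable names (labels, scores, auc_pairwise) are chosen only for illustration.

```r
# Toy data (hypothetical): 1 = churner, 0 = non-churner
labels <- c(1, 0, 1, 0, 0, 1)
scores <- c(0.9, 0.3, 0.6, 0.7, 0.2, 0.8)

pos <- scores[labels == 1]  # scores of churners
neg <- scores[labels == 0]  # scores of non-churners

# Compare every churner score with every non-churner score.
# A churner ranked higher counts as 1, a tie counts as 0.5.
wins <- outer(pos, neg, ">") + 0.5 * outer(pos, neg, "==")

# AUC = share of (churner, non-churner) pairs that are ranked correctly
auc_pairwise <- mean(wins)
print(auc_pairwise)
```

Here 8 of the 9 churner/non-churner pairs are ordered correctly, so the value is 8/9. With the pROC package installed, auc(labels, scores) should report the same number for this toy input.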
This exercise is part of the course
Predictive Analytics using Networked Data in R
Exercise instructions
- Compute the AUC of the relational neighbor classifier by calling the auc function in the pROC package, using the actual churn labels customers$churn and the churnProb as the predicted value.
- Write a for loop where you apply the probabilistic relational neighbor classifier ten times, and assign the value again to the churnProb vector in each iteration.
- Compute the AUC again using the updated churnProb vector.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Load the pROC package and data
library(pROC)
load("Nex132.RData")
# Compute the AUC
___(customers$churn, as.vector(churnProb))
# Write a for loop to update the probabilities
___(i in 1:10){
___ <- as.vector((AdjacencyMatrix %*% churnProb) / neighbors)
}
# Compute the AUC again
___(customers$churn, as.vector(___))
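To see what the update inside the loop does, it may help to run the same kind of step on a tiny network. This is only an illustrative sketch: the 4-node graph, the starting probabilities, and the names A, deg and prob are hypothetical and stand in for AdjacencyMatrix, neighbors and churnProb from the exercise data. It assumes every node has at least one neighbor, since the update divides by the neighbor count.

```r
# Hypothetical undirected graph with edges 1-2, 1-3, 2-3, 3-4
A <- matrix(c(0, 1, 1, 0,
              1, 0, 1, 0,
              1, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)

deg  <- rowSums(A)      # number of neighbors of each node
prob <- c(1, 0, 0, 0)   # made-up initial churn probabilities

for (i in 1:10) {
  # New probability of a node = average of its neighbors' current probabilities
  prob <- as.vector((A %*% prob) / deg)
}

print(round(prob, 3))
```

Each pass replaces a node's probability with the mean over its neighbors, so information spreads one step further through the network per iteration. Note that in this sketch, as in the exercise loop, all nodes are updated at once, including the node that started with probability 1.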