Collective Inferencing

Collective inferencing is a procedure to simultaneously label nodes in interconnected data to reduce classification error.

In this exercise you will perform collective inferencing and measure its effect on churn prediction with the AUC performance measure. AUC, or area under the ROC curve, is commonly used to assess the performance of classification techniques.

AUC = the probability that a randomly chosen churner is ranked higher by the model than a randomly chosen non-churner
AUC = a number that typically lies between 0.5 (no better than random ranking) and 1 (perfect ranking), where a higher number means a better model
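
The pairwise definition above can be checked directly on a small example. The following is a minimal sketch in base R, assuming made-up labels and scores; the helper pairwise_auc() is hypothetical and is not part of pROC or of the exercise data.

```r
# Toy data (hypothetical): 1 = churner, 0 = non-churner
labels <- c(1, 0, 1, 0, 0, 1)
scores <- c(0.9, 0.4, 0.6, 0.7, 0.2, 0.8)

# Hypothetical helper: fraction of (churner, non-churner) pairs in which
# the churner receives the higher score; ties count as half a pair
pairwise_auc <- function(labels, scores) {
  pos <- scores[labels == 1]        # scores of the churners
  neg <- scores[labels == 0]        # scores of the non-churners
  diffs <- outer(pos, neg, "-")     # one entry per (churner, non-churner) pair
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

auc_value <- pairwise_auc(labels, scores)
print(auc_value)
```

Here 8 of the 9 churner/non-churner pairs are ordered correctly, so the sketch yields 8/9. The exercise itself relies on the auc() function from pROC rather than on a hand-written helper like this one.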

Does collective inferencing increase the AUC value?

This exercise is part of the course

Predictive Analytics using Networked Data in R


Exercise instructions

Compute the AUC of the relational neighbor classifier by calling the auc() function from the pROC package, with the actual churn labels customers$churn as the response and churnProb as the predicted values.
Write a for loop that applies the probabilistic relational neighbor classifier ten times, assigning the result back to the churnProb vector in each iteration.
Compute the AUC again using the updated churnProb vector.
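
To illustrate what the repeated update in these steps does, here is a minimal sketch on a toy network, assuming a made-up 5-customer adjacency matrix and made-up starting probabilities. The names A, degree, and prob are hypothetical stand-ins and are not the objects stored in the exercise data.

```r
# Hypothetical toy network of 5 customers (symmetric adjacency matrix)
A <- matrix(c(0, 1, 1, 0, 0,
              1, 0, 1, 0, 0,
              1, 1, 0, 1, 0,
              0, 0, 1, 0, 1,
              0, 0, 0, 1, 0), nrow = 5, byrow = TRUE)
degree <- rowSums(A)   # number of neighbors of each customer

# Made-up starting churn probabilities
prob <- c(1, 0, 0.5, 0, 0)

# Each pass replaces every customer's probability with the average
# probability of its neighbors, computed from the previous pass
for (i in 1:10) {
  prob <- as.vector((A %*% prob) / degree)
}
print(round(prob, 3))
```

Because every value is an average of neighboring values, the probabilities stay between 0 and 1, and information from labeled customers spreads one hop further with each pass.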

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Load the pROC package and data
library(pROC)
load("Nex132.RData")

# Compute the AUC
___(customers$churn, as.vector(churnProb))

# Write a for loop to update the probabilities
___(i in 1:10){
 ___ <- as.vector((AdjacencyMatrix %*% churnProb) / neighbors)
}

# Compute the AUC again
___(customers$churn, as.vector(___))
