Computing association and dissociation
The library has come back to you once again about your recommendation to promote Harry Potter using Twilight. They're worried that the two books might be dissociated, which could hurt their promotional effort. They ask you to verify that this is not the case.
You immediately think of Zhang's metric, which measures association and dissociation continuously. Association is positive and dissociation is negative. As with the previous exercises, the DataFrame books has been imported for you, along with numpy under the alias np. Zhang's metric is computed as follows:
$$Zhang(A \rightarrow B) = \frac{Support(A \& B) - Support(A) Support(B)}{\max[Support(A \& B)(1-Support(A)),\ Support(A)(Support(B)-Support(A \& B))]}$$
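To see how the sign of the metric behaves, here is a minimal sketch of the formula as a plain Python function. The name zhang_metric and the support values passed to it are made up for illustration and are not part of the exercise; the function also assumes the denominator is nonzero.

```python
def zhang_metric(support_a, support_b, support_ab):
    """Zhang's metric for the rule A -> B, from three support values."""
    numerator = support_ab - support_a * support_b
    denominator = max(
        support_ab * (1 - support_a),
        support_a * (support_b - support_ab),
    )
    # Assumes a nonzero denominator; degenerate supports (e.g. 0 or 1) are not handled.
    return numerator / denominator


# Made-up supports: A appears in 40% of transactions, B in 50%.
# Under independence the pair would appear in 0.4 * 0.5 = 20% of them.
associated = zhang_metric(0.4, 0.5, 0.3)   # pair is more frequent than 20%
dissociated = zhang_metric(0.4, 0.5, 0.1)  # pair is less frequent than 20%
print(associated, dissociated)
```

The first value comes out positive and the second negative, matching the reading that association is positive and dissociation is negative.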
This exercise is part of the course Market Basket Analysis in Python.
Exercise instructions
- Compute the support of {Twilight} and the support of {Potter}.
- Compute the support of {Twilight, Potter}.
- Complete the expression for the denominator.
- Compute Zhang's metric for {Twilight} \(\rightarrow\) {Potter}.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Compute the support of Twilight and Harry Potter
supportT = books['Twilight'].____
supportP = books['Potter'].____
# Compute the support of both books
supportTP = ____.mean()
# Compute the numerator and complete the expression for the denominator
numerator = supportTP - supportT*supportP
denominator = ____(supportTP*(1-supportT), supportT*(supportP-supportTP))
# Compute and print Zhang's metric
zhang = ____ / ____
print(zhang)
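For reference, one possible way to fill in the blanks is sketched below. The course's books DataFrame is not available here, so this version builds a small made-up stand-in with the same two column names; the values are invented purely so the script runs on its own, and the completion shown is an assumption rather than the official solution.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the course's books DataFrame (8 made-up transactions).
books = pd.DataFrame({
    'Twilight': [True, True, False, False, True, False, False, False],
    'Potter':   [True, True, True, False, False, True, False, True],
})

# Support of each book: the share of transactions that contain it
supportT = books['Twilight'].mean()
supportP = books['Potter'].mean()

# Support of both books: the share of transactions that contain the pair
supportTP = np.logical_and(books['Twilight'], books['Potter']).mean()

# Numerator and denominator of Zhang's metric
numerator = supportTP - supportT*supportP
denominator = max(supportTP*(1-supportT), supportT*(supportP-supportTP))

# Compute and print Zhang's metric
zhang = numerator / denominator
print(zhang)
```

On this toy data the supports are 0.375, 0.625 and 0.25, so the metric works out to about 0.1: a weak positive association. The real books data will give a different value.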