What Makes an Important Product?
Now that we have a working definition of an important product, let's see whether important products share any properties that might be correlated. One candidate pairing is salesrank.from and salesrank.to. We can ask whether important products tend to have higher sales ranks (that is, smaller rank numbers) than the products people purchase downstream. We'll look at this by first subsetting out the important vertices, joining those back to the initial dataframe, and then creating a new dataframe using the package dplyr. We'll create a new graph and color each edge blue when it runs from a high-ranking product (1, 2, 3) to a low-ranking one (20, 21, 22), and red for the opposite. If rank is correlated with downstream purchasing, we'll see mostly blue links; if there's no relationship, the links will be about equally red and blue.
The dataset ip_df contains the information about important products.
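To make the blue-versus-red reasoning concrete, here is a minimal sketch on a made-up data frame. The name toy_df and all of its values are hypothetical stand-ins, not the course data; only the column names from, to, salesrank.from, and salesrank.to are taken from the exercise.

```r
# Toy stand-in for ip_df: the values below are invented for illustration.
toy_df <- data.frame(
  from           = c("A", "A", "B", "C", "D"),
  to             = c("B", "C", "D", "D", "E"),
  salesrank.from = c(1, 1, 20, 3, 50),
  salesrank.to   = c(20, 3, 50, 50, 2)
)

# Blue: the source product ranks at least as well (smaller or equal number)
# as the product bought downstream. Red: the opposite.
toy_color <- ifelse(
  toy_df$salesrank.from <= toy_df$salesrank.to,
  yes = "blue",
  no  = "red"
)

# Share of each color; a split near 50/50 would suggest no relationship.
color_share <- prop.table(table(toy_color))
print(color_share)
```

Tallying the colors this way gives a quick numeric check alongside the visual impression from the plot.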
This exercise is part of the course Case Studies: Network Analysis in R.
Have a go at this exercise by completing this sample code.
# Select the from and to columns from ip_df
ip_df_from_to <- ip_df[c(___,___)]
# Create a directed graph from the data frame
ip_g <- graph_from_data_frame(___, directed = ___)
# Set the edge color. If salesrank.from is less than or
# equal to salesrank.to then blue else red.
edge_color <- ifelse(
  ip_df$___ <= ip_df$___,
  yes = ___,
  no = ___
)
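As a rough illustration of how a color vector like this can be attached to a graph, the sketch below builds a small directed graph from a made-up edge list. The names toy_df, toy_color, and toy_g and all data values are hypothetical; the igraph calls are guarded so the snippet still runs where the package is not installed.

```r
# Made-up edge list standing in for ip_df; not the course data.
toy_df <- data.frame(
  from           = c("A", "A", "B", "D"),
  to             = c("B", "C", "D", "E"),
  salesrank.from = c(1, 1, 20, 50),
  salesrank.to   = c(20, 3, 50, 2)
)

# One color per row, using the same rule as the exercise comment.
toy_color <- ifelse(toy_df$salesrank.from <= toy_df$salesrank.to, "blue", "red")

if (requireNamespace("igraph", quietly = TRUE)) {
  # The first two columns are read as the source and target of each edge.
  toy_g <- igraph::graph_from_data_frame(toy_df[c("from", "to")], directed = TRUE)

  # Edges are created in row order, so the colors line up by position.
  toy_g <- igraph::set_edge_attr(toy_g, "color", value = toy_color)
}
```

Calling plot(toy_g) afterwards should draw each edge in its stored color, since igraph's plotting picks up a color edge attribute when one is present.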