1. What is a hierarchical choice model?
Let's talk about what hierarchical choice models are, why they are important, and how you can fit one with mlogit().
2. Heterogeneity in preferences
Hierarchical choice models account for differences in preferences between decision makers. Nearly everyone likes chocolate, but we like different types of chocolate. I strongly prefer white chocolate, but many other people dislike it. This concept is called heterogeneity.
If you believe that each person observed in your data may have different preferences, then you could consider estimating a choice model separately for each person. However, this usually isn't an option because we don't have enough data for each person. For example, the sportscar data only has 10 observed choices per person and we want to estimate a model with 5 parameters. If you try passing the data for one person to mlogit() you'll get an error.
3. Hierarchical choice models (random coefficients models)
Instead, we make an assumption that each person's coefficients are drawn from a distribution. Let me show you how this works with some pseudocode.
Here, I've written that each respondent has a vector beta of coefficients that comes from a multivariate normal distribution with mean beta_0 and covariance Sigma. The multivariate normal is a vector version of the normal distribution. Hierarchical models are also called random coefficients models because the coefficients are assumed to be randomly distributed across people.
Then for each choice task, the probability of each choice is computed following the logit model using the beta vector of coefficients for each person. The only difference between this code and the code for the standard multinomial logit is that we have to subset out the data for respondent i and task j.
To summarize, this type of model draws each person's coefficients from a distribution and then uses them in the logit model. We call it a hierarchical model because it stacks together an upper-level multivariate normal model for the coefficients with a lower-level multinomial logit model for the choices.
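The two-level structure described above can be sketched in a few lines of R. This is illustrative pseudocode, not mlogit's internals; the coefficient names, attribute values, and covariance are made-up assumptions.

```r
# Minimal sketch of the hierarchy: an upper-level multivariate normal
# for the coefficients, and a lower-level multinomial logit for choices.
library(MASS)  # mvrnorm() draws from a multivariate normal

set.seed(42)
beta_0 <- c(price = -0.15, trans = 0.5)  # upper-level means (illustrative)
Sigma  <- diag(c(0.05, 0.10)^2)          # upper-level covariance (illustrative)

# Upper level: draw one respondent's coefficient vector beta
beta_i <- mvrnorm(1, mu = beta_0, Sigma = Sigma)

# Lower level: logit probabilities for one choice task with two alternatives.
# Rows of X are alternatives; columns line up with the coefficients.
X <- matrix(c(30, 1,
              40, 0), nrow = 2, byrow = TRUE,
            dimnames = list(NULL, names(beta_0)))
utility <- as.vector(X %*% beta_i)
prob <- exp(utility) / sum(exp(utility))  # choice probabilities sum to 1
```

In a full model this draw-then-logit step would loop over every respondent i and task j, subsetting out that person's data each time.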
4. Fitting a hierarchical multinomial logit model
Fitting a hierarchical multinomial logit model using mlogit() is easy. It only requires a few extra inputs to mlogit-dot-data() and mlogit().
When we want to fit a hierarchical model we add the id-dot-var input to mlogit-dot-data(). This input should give the name of the column in the data that identifies the decision maker for each choice. In sportscar, this variable is called resp_id.
Second, when we call mlogit(), we have two additional parameters. The first, called rpar, is a named vector where the name of each element is one of the coefficients and the value indicates its distribution. In our case, we've set rpar to price equals "n", which tells mlogit() that we want the price parameter to be normally distributed across people. The second is panel equals TRUE, which tells mlogit() that we have repeated choices for each person.
Everything else - the formula and the data - is exactly as it was before.
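Putting those pieces together, a fitting call looks like the sketch below. The id.var, rpar, and panel arguments are as described above; the formula and the column names seat, trans, convert, and alt are assumptions based on how the sportscar data is used in this lesson.

```r
library(mlogit)

# Reshape the data; id.var names the column identifying the decision maker
sportscar_ml <- mlogit.data(sportscar, choice = "choice", shape = "long",
                            alt.var = "alt", id.var = "resp_id")

# rpar = c(price = "n") makes price normally distributed across people;
# panel = TRUE signals repeated choices per respondent
m7 <- mlogit(choice ~ 0 + seat + trans + convert + price,
             data = sportscar_ml,
             rpar = c(price = "n"), panel = TRUE)
```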
5. Hierarchical model coefficients
When we summarize the hierarchical model, we can see that there is a new parameter called sd-dot-price. sd-dot-price is the log of the standard deviation of the normal distribution that mlogit() estimated for price. It tells us how much heterogeneity there is in price preferences among sports car buyers.
We can also see the quartiles of the distribution of the price parameters in the table at the bottom labeled random coefficients.
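Assuming m7 is the hierarchical model fitted earlier, both pieces of output can be inspected directly; rpar() is mlogit's extractor for a fitted random coefficient.

```r
summary(m7)                      # coefficient table includes sd.price

price_rpar <- rpar(m7, "price")  # extract the random price coefficient
summary(price_rpar)              # quartiles of its estimated distribution
```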
6. Distribution of the price coefficient
We can also look at the distribution of the price coefficient by typing plot(m7). This calls the special plotting function for mlogit() objects, which gives us a density plot of the random coefficient. We can see that the price coefficients range from about -0-point-24 to -0-point-10. That means some people are twice as price sensitive as others when choosing sports cars.
7. Let's practice!
I bet there is a lot of heterogeneity in preferences for chocolates! Let's fit some hierarchical choice models to find out how much.