1. Choice data in two files
Another way you will sometimes get choice data is in two separate files.
2. Choice data in two files
For example, I've loaded the sportscar data in two separate data frames. The first data frame, sportscar_alts, contains the descriptions of the alternatives in each question. But, unlike regular long-format choice data, sportscar_alts does not record which alternative each respondent chose. That's in another data frame called sportscar_choices.
sportscar_choices has one row for each observed choice, with columns telling us the respondent id (resp_id), the question number, the customer segment that the respondent belongs to, and the choice. We have to look back at sportscar_alts to find out that the chosen alternative - alternative 3 - was a five-seat automatic non-convertible for thirty thousand dollars.
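To make that lookup concrete, here is a minimal sketch with toy stand-ins for the two data frames. Only resp_id, segment and choice are named in the narration; the other column names (ques, alt, seat, trans, convert, price) and all of the values are assumptions made up for illustration.

```r
# Toy stand-in for the alternatives file: one row per alternative per question.
# Column names other than resp_id are assumed; price is in thousands of dollars.
sportscar_alts <- data.frame(
  resp_id = rep(1, 6),
  ques    = rep(1:2, each = 3),
  alt     = rep(1:3, times = 2),
  seat    = c(2, 4, 5, 2, 5, 4),
  trans   = c("manual", "auto", "auto", "auto", "manual", "auto"),
  convert = c("yes", "no", "no", "no", "yes", "no"),
  price   = c(35, 40, 30, 30, 35, 40)
)

# Toy stand-in for the choices file: one row per answered question.
sportscar_choices <- data.frame(
  resp_id = c(1, 1),
  ques    = 1:2,
  segment = c("basic", "basic"),
  choice  = c(3, 1)
)

# Look up the attributes of the alternative chosen in the first question.
first  <- sportscar_choices[1, ]
chosen <- subset(sportscar_alts,
                 resp_id == first$resp_id &
                   ques == first$ques &
                   alt == first$choice)
print(chosen)
```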
This type of format is common when the choices you observe are purchases at a store. The transactions, which tell us what product each consumer purchased, will be in one file similar to sportscar_choices and the products that were available at the time of each purchase will be in another file similar to sportscar_alts.
And it can get even more complicated - sometimes the characteristics of the decision makers are in another file with one row for each respondent. But let's focus on the case of just two files for now.
3. Merging the two files
We can merge the two files together using the merge() function. The syntax for merge() is pretty straightforward. The first two inputs are the data frames you want to merge. In this case, we are merging sportscar_choices and sportscar_alts. The by input is a character vector of the names of the columns that you want to match between the two data frames. In this case, we want to match on the resp_id and the question number.
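As a rough sketch of that call, using small made-up data frames: the question-number column is assumed to be called ques, and the alt and price columns and all values are likewise hypothetical.

```r
# Toy inputs; ques, alt and price are assumed column names.
sportscar_alts <- data.frame(
  resp_id = rep(1, 6),
  ques    = rep(1:2, each = 3),
  alt     = rep(1:3, times = 2),
  price   = c(35, 40, 30, 30, 35, 40)
)
sportscar_choices <- data.frame(
  resp_id = c(1, 1),
  ques    = 1:2,
  segment = c("basic", "basic"),
  choice  = c(3, 1)
)

# Match rows on respondent id and question number.
sportscar <- merge(sportscar_choices, sportscar_alts,
                   by = c("resp_id", "ques"))
print(sportscar)
```

Each row of sportscar_choices is repeated once for every alternative in the matching question, so segment and choice appear alongside the attributes of each alternative.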
By default, merge() keeps only the rows whose key values appear in both data frames, similar to an inner join in SQL; setting all = TRUE gives an outer join instead. Because every question in sportscar_alts has a matching row in sportscar_choices, what we end up with is a table with the same number of rows as sportscar_alts - one row for each alternative in each question for each respondent - but we now have the choice and segment columns merged in from sportscar_choices.
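To see how merge() treats rows whose keys appear in only one data frame, this sketch compares the default (all = FALSE) with all = TRUE. The toy data is hypothetical: question 3 is deliberately left without a recorded choice, and the ques, alt and price column names are assumptions.

```r
# Toy alternatives for three questions; toy choices for only two of them.
toy_alts <- data.frame(
  resp_id = rep(1, 6),
  ques    = rep(1:3, each = 2),
  alt     = rep(1:2, times = 3),
  price   = c(30, 35, 40, 30, 35, 40)
)
toy_choices <- data.frame(
  resp_id = c(1, 1),
  ques    = 1:2,
  choice  = c(2, 1)
)

# Default (all = FALSE): question 3 has no match, so its rows are dropped.
inner <- merge(toy_choices, toy_alts, by = c("resp_id", "ques"))

# all = TRUE: unmatched rows are kept and choice is filled with NA.
outer <- merge(toy_choices, toy_alts, by = c("resp_id", "ques"), all = TRUE)

print(nrow(inner))
print(nrow(outer))
print(sum(is.na(outer$choice)))
```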
We still might want to sort the merged sportscar data and convert the choice column to a logical indicator, but I've already done that in the video on "Converting from wide to long", so I won't show it to you again.
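For reference, one possible way to do those two steps is sketched below on a small made-up merged table. The ques, alt and price column names are assumptions, and this is not necessarily the approach used in the earlier video.

```r
# Toy merged data in scrambled order; choice holds the number of the chosen alternative.
sportscar <- data.frame(
  resp_id = rep(1, 6),
  ques    = c(2, 2, 2, 1, 1, 1),
  alt     = c(3, 1, 2, 2, 3, 1),
  choice  = c(1, 1, 1, 3, 3, 3),
  price   = c(40, 30, 35, 40, 30, 35)
)

# Sort by respondent, then question, then alternative.
sportscar <- sportscar[order(sportscar$resp_id, sportscar$ques, sportscar$alt), ]

# Convert choice to TRUE for the chosen alternative and FALSE otherwise.
sportscar$choice <- sportscar$choice == sportscar$alt
print(sportscar)
```

After the conversion, each question has exactly one row where choice is TRUE.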
4. Let's practice!
Now it's your turn to try merging the chocolate choices that I've stored in two different files.