Setting and viewing data.table keys

1. Setting and viewing data.table keys

In this lesson, you will learn how to set and view the keys of a data table.

2. Setting `data.table` keys

In the previous lesson, you learned how to perform joins using the data table syntax. In each case, you had to use the on argument to specify how to match rows between the two data tables. However, it is possible to tell R in advance of a join which columns are the keys for each data table, removing the need for the on argument. This is useful if you find yourself performing several different joins with a single data table. Setting a key also sorts the data table by the key columns in memory, which makes joining and filtering operations on those columns much faster for large data tables. With that in mind, it's useful to know that you can set multiple key columns for a single data table. You'll learn more about joins that require multiple keys in the next chapter.

3. The `setkey()` function

The setkey() function is used for this purpose. It takes a single data table as its first argument, then any number of key column names as its remaining arguments. The column names can be entered as if they were variables or wrapped in quotes; either will work. If you don't provide any column names to the setkey() function, it will use all columns of the data table as its keys!
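As a minimal sketch of the calls described above, the following uses small made-up tables (`dt_unquoted`, `dt_quoted`, `dt_multi`, and `dt_all` are hypothetical names, not from the lesson) to show unquoted names, quoted names, multiple keys, and the no-column-names case.

```r
library(data.table)

# Hypothetical example data, invented for illustration
dt_unquoted <- data.table(id = c(3L, 1L, 2L), name = c("c", "a", "b"))
dt_quoted   <- copy(dt_unquoted)
dt_multi    <- copy(dt_unquoted)
dt_all      <- copy(dt_unquoted)

# Column name entered as if it were a variable
setkey(dt_unquoted, id)

# Column name wrapped in quotes
setkey(dt_quoted, "id")

# Several key columns at once
setkey(dt_multi, name, id)

# No column names: every column becomes part of the key
setkey(dt_all)

# setkey() modifies the table in place and sorts it by the key
dt_unquoted$id   # 1 2 3
```

Note that setkey() changes the data table by reference, so there is no need to assign the result back to a variable.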

4. The `setkey()` function

When keys are set for both data tables, you can perform joins with the data table syntax without the on argument; rows are matched on the key columns.
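To illustrate, here is a small sketch of a keyed join. The `customers` and `orders` tables and their columns are invented for this example; the join itself uses only the standard data table syntax.

```r
library(data.table)

# Hypothetical example data, invented for illustration
customers <- data.table(id = c(1L, 2L, 3L), name = c("Ann", "Bob", "Cy"))
orders    <- data.table(id = c(2L, 3L, 3L), amount = c(10, 20, 30))

# Set the same key column on both tables
setkey(customers, id)
setkey(orders, id)

# Join without on: rows are matched using the keys
joined_keys <- customers[orders]

# The equivalent join with an explicit on argument, for comparison
joined_on <- customers[orders, on = "id"]

joined_keys
```

Both calls should match each row of `orders` to its customer, so the explicit on argument is only needed when the keys are not set or you want to join on different columns.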

5. Setting keys programmatically

You can also use the setkeyv() function to set the keys of a data table by passing in a character vector of the key column names. This is useful if you want to set the keys of a data table programmatically, where your key column names are stored in another variable.
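A short sketch of this, assuming the key column names live in a character vector called `key_cols` (a hypothetical name, as is the `sales` table):

```r
library(data.table)

# Hypothetical example data, invented for illustration
sales <- data.table(
  region = c("west", "east", "east"),
  year   = c(2021L, 2022L, 2021L),
  total  = c(5, 7, 9)
)

# Key column names stored in another variable
key_cols <- c("region", "year")

# setkeyv() takes the names as a character vector
setkeyv(sales, key_cols)

key(sales)   # "region" "year"
sales        # now sorted by region, then year
```

This form is convenient inside functions or loops, where the key columns are not known until the code runs.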

6. Getting keys

You can check whether a data table has any key columns set by using the haskey() function, and get the key you've set by using the key() function.

7. Getting keys

If you haven't set a key for a data table, the haskey() function will return FALSE and the key() function will return NULL.
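The behaviour before and after setting a key can be checked with a small sketch like the one below. The `dt` table and the `*_before` / `*_after` variables are hypothetical names used only to capture the results.

```r
library(data.table)

# Hypothetical example data, invented for illustration
dt <- data.table(id = c(2L, 1L), value = c("b", "a"))

# No key has been set yet
has_before <- haskey(dt)   # FALSE
key_before <- key(dt)      # NULL

setkey(dt, id)

# After setting the key
has_after <- haskey(dt)    # TRUE
key_after <- key(dt)       # "id"
```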

8. Viewing all `data.tables` and their keys

The tables() function you learned about in the very first lesson will also show you the keys you have set for any data tables in your R session.
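As a rough sketch, the call below lists the data tables in the current environment; the `keyed_dt` and `unkeyed_dt` tables are made up for illustration. The summary includes a KEY column showing any key columns that have been set.

```r
library(data.table)

# Hypothetical example data, invented for illustration
keyed_dt   <- data.table(id = c(2L, 1L), value = c("b", "a"))
unkeyed_dt <- data.table(x = 1:3)

setkey(keyed_dt, id)

# Prints a summary of all data tables, including their keys,
# and invisibly returns that summary as a data table
info <- tables()
```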

9. Let's practice!

Now it's your turn to play with keys.