Dropping and moving columns

1. Dropping and moving columns

Welcome to chapter two! We know how to select several columns from our DataFrame, but what if we want to keep all columns except for one or two?

2. How to select most columns

One possibility would be to use the select function and type the names of all columns except for the ones we want to drop. However, with large datasets and complicated names, this quickly becomes tedious and error-prone. Fortunately, the DataFrames package offers a better solution: the Not operator.

3. Not() operator

Let's consider the chocolates DataFrame. We are interested in all of the columns except review_date. To drop it using the Not operator, we call select and pass the chocolates DataFrame, followed by a call of Not that wraps the name of the column we want to leave out.
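As a minimal sketch of the call described above: the small chocolates DataFrame below is a made-up stand-in for the course dataset, and its column names and values are assumptions for illustration only.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
    rating = [3.75, 4.0],
)

# Keep every column except review_date.
without_date = select(chocolates, Not(:review_date))

println(names(without_date))  # ["company", "cocoa", "rating"]
```

Note that select returns a new DataFrame, so chocolates itself still contains review_date.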

4. select() and Not() pitfalls

This approach fails with an error if the column we want to drop is not present in the DataFrame. This might occur, for example, if we have already removed the column earlier. While the error helps in catching typos, it can slow us down in more complicated code. Luckily, there are two solutions to this.
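The pitfall can be reproduced with a short sketch. The toy DataFrame is again an assumption, and the exact error message depends on the installed DataFrames version, so the code only reports whatever is thrown.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
)

# The first drop works because review_date is present.
trimmed = select(chocolates, Not(:review_date))

# The second drop throws because the column is already gone.
failed = try
    select(trimmed, Not(:review_date))
    false
catch err
    println("Dropping failed: ", sprint(showerror, err))
    true
end
```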

5. Safely dropping a column that is not there

Firstly, we can call select(chocolates, Not(Cols(==("review_date")))), where the name of our column is given as a string. The second option is to use just the Not operator along with a regex. In our case, we can call select and pass the chocolates DataFrame followed by a call of Not, inside which we provide the regex r"review\_date". Both options select columns by matching their names rather than looking one name up directly, so if nothing matches, nothing is dropped and no error is raised. In the regex, the underscore is preceded by a backslash. Strictly speaking, an underscore has no special meaning in a regex, so the backslash is optional here; characters that do have a special meaning, such as a dot, must be preceded by a backslash.
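Both safe variants can be sketched as follows, applied to a DataFrame from which review_date has already been removed. The toy data is an assumption. The regex here is anchored with ^ and $, which is a choice made for this sketch and not part of the narration: an unanchored regex would also match any other column whose name merely contains review_date.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
)

# review_date is already gone in this DataFrame.
trimmed = select(chocolates, Not(:review_date))

# Option 1: a predicate on column names; no match means nothing is dropped.
safe_predicate = select(trimmed, Not(Cols(==("review_date"))))

# Option 2: a regex; anchored so that only the exact name matches.
safe_regex = select(trimmed, Not(r"^review_date$"))

println(names(safe_predicate))  # ["company", "cocoa"]
println(names(safe_regex))      # ["company", "cocoa"]
```

Applied to the original chocolates DataFrame, both calls drop review_date as usual.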

6. Reordering columns

A similar problem to dropping columns is reordering them. We can again do it by writing the columns in the desired order inside a select statement. But if we want to move one or two columns to the left or to the right, there is a much better solution, using a colon. Let's look at the chocolates DataFrame, where we pre-selected a few columns so that everything fits on the screen. We would like to keep all the columns but we want to rearrange them for better readability.

7. Move it to the left

First, we want to move the cocoa column to the left, while keeping the other order intact. To do that, we can call select and pass chocolates, cocoa and a colon.
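A minimal sketch of this move, using a made-up stand-in for the pre-selected chocolates columns (names and values are assumptions):

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
    rating = [3.75, 4.0],
)

# cocoa first, then all remaining columns in their original order.
cocoa_first = select(chocolates, :cocoa, :)

println(names(cocoa_first))  # ["cocoa", "company", "review_date", "rating"]
```

The colon stands for all columns; columns that were already selected, here cocoa, are not repeated.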

8. Move them to the left

To move both the cocoa and rating columns to the left, we just include rating between cocoa and the colon.
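The same pattern with two leading columns might look like this; the toy DataFrame is an assumption.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
    rating = [3.75, 4.0],
)

# cocoa and rating first, then everything else in its original order.
two_first = select(chocolates, :cocoa, :rating, :)

println(names(two_first))  # ["cocoa", "rating", "company", "review_date"]
```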

9. Move it right

When we want to move a column to the end, we can use the Not operator. Here we are moving the company column to the end by calling select and passing chocolates, a call of Not where we pass company, and finally company. If we did not mention the company column again, it would be dropped.
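A short sketch of moving a column to the end, on a made-up stand-in DataFrame (names and values are assumptions). It also shows what happens when the trailing company is left out.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
    rating = [3.75, 4.0],
)

# Everything except company, followed by company itself.
company_last = select(chocolates, Not(:company), :company)

# Without the trailing :company the column is dropped instead of moved.
company_dropped = select(chocolates, Not(:company))

println(names(company_last))     # ["review_date", "cocoa", "rating", "company"]
println(names(company_dropped))  # ["review_date", "cocoa", "rating"]
```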

10. Move them all around

We can mix these two approaches, leading columns before a Not call and trailing columns after it, to get the order we want.
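One possible combination, sketched on a made-up stand-in DataFrame (names and values are assumptions): cocoa goes to the front and company to the end in a single call.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
    rating = [3.75, 4.0],
)

# cocoa first, then everything except company, then company last.
# Not(:company) also covers cocoa, but already selected columns are not repeated.
mixed = select(chocolates, :cocoa, Not(:company), :company)

println(names(mixed))  # ["cocoa", "review_date", "rating", "company"]
```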

11. Move and drop at the same time

We can also combine reordering with what we know about dropping columns. That way, we can easily reorder and trim down a DataFrame in one select statement. Here we have an example of how it can look when we put things together: we are reordering the columns while also dropping the review_date column.
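Reordering and dropping in one statement could be sketched like this. The toy DataFrame and the chosen target order are assumptions, not the example shown on the slide.

```julia
using DataFrames

# Toy stand-in for the chocolates DataFrame (names and values are assumptions).
chocolates = DataFrame(
    company = ["Company A", "Company B"],
    review_date = [2016, 2015],
    cocoa = [63.0, 70.0],
    rating = [3.75, 4.0],
)

# rating first, company last, review_date dropped because it is excluded by Not
# and never mentioned again.
tidy = select(chocolates, :rating, Not([:company, :review_date]), :company)

println(names(tidy))  # ["rating", "cocoa", "company"]
```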

12. Cheat sheet

Here is a cheat sheet for you, summarizing all this information.

13. Let's practice!

Are you ready to mix things up? Let's head to the exercises!
