1. Replace values using dictionaries
Great job on using lists to replace values! Let's now discuss how we can use dictionaries to replace values in a pandas DataFrame efficiently.
2. Replace single values with dictionaries
As with lists, we can use a different data structure to map the values we want to replace to the ones that should replace them. Dictionaries are a valuable arrow in your Python quiver, and they will serve our purpose well.
We're going to use dictionaries to replace every male's gender with BOY and every female's gender with GIRL. The syntax is very simple: we map each value we want to replace to the value we want to replace it with, using the colon symbol.
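As a minimal sketch of that syntax, assuming a small stand-in DataFrame (the `names` variable, the `Gender` column name and the MALE/FEMALE labels are illustrative assumptions, not the course dataset):

```python
import pandas as pd

# Hypothetical stand-in data; the column name and labels are assumptions
names = pd.DataFrame({
    "Gender": ["MALE", "FEMALE", "FEMALE", "MALE"],
    "Count": [10, 12, 7, 9],
})

# Each key is a value to replace; each value is its replacement
names["Gender"] = names["Gender"].replace({"MALE": "BOY", "FEMALE": "GIRL"})

print(names)
```

Values that are not keys of the dictionary are left unchanged.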
We could do the same thing with lists, but it's more verbose. If we compare both methods in this example, we can see that dictionaries run approximately 55% faster.
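One way to reproduce this kind of comparison is sketched below. The `genders` Series is made-up data, and the timings depend on the machine, the pandas version and the size of the data, so the numbers you see may differ from the figure quoted above.

```python
import timeit

import pandas as pd

# Hypothetical data: 2,000 made-up gender labels
genders = pd.Series(["MALE", "FEMALE"] * 1000)

def with_lists():
    # Two parallel lists: values to replace, then their replacements
    return genders.replace(["MALE", "FEMALE"], ["BOY", "GIRL"])

def with_dict():
    # One dictionary mapping each old value to its new value
    return genders.replace({"MALE": "BOY", "FEMALE": "GIRL"})

# Both approaches should produce the same result
same_result = with_lists().equals(with_dict())
print("Same result:", same_result)

# Time 20 runs of each approach; results vary from run to run
list_time = timeit.timeit(with_lists, number=20)
dict_time = timeit.timeit(with_dict, number=20)
print(f"lists: {list_time:.4f}s, dictionary: {dict_time:.4f}s")
```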
In general, looking things up in a dictionary in Python is very efficient compared to a list: searching a list requires a pass over its elements one by one, while a dictionary goes directly to the key that matches the entry. The comparison is a little unfair though, since the two structures serve different purposes.
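The difference between the two kinds of lookup can be illustrated in plain Python, outside pandas. This is only a sketch of the idea; the `pairs` list and the helper function `lookup_in_list` are hypothetical names made up for the illustration.

```python
# Hypothetical mapping stored two ways: as a list of pairs and as a dictionary
pairs = [("MALE", "BOY"), ("FEMALE", "GIRL")]
mapping = dict(pairs)

def lookup_in_list(key):
    # A list has to be scanned element by element until a match is found
    for old, new in pairs:
        if old == key:
            return new
    return key  # no match: keep the original value

# A dictionary hashes the key and jumps straight to the matching entry
from_list = lookup_in_list("FEMALE")
from_dict = mapping.get("FEMALE", "FEMALE")

print(from_list, from_dict)
```

With only two entries the difference is negligible; it starts to matter as the number of values to look up grows.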
3. Replace multiple values using dictionaries
Dictionaries also let you replace the same values across several different columns at once.
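For instance, calling .replace() on the whole DataFrame with a flat dictionary applies the mapping to every column. The sketch below assumes a made-up DataFrame; `survey` and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical data with the same labels appearing in two columns
survey = pd.DataFrame({
    "First child": ["MALE", "FEMALE", "MALE"],
    "Second child": ["FEMALE", "FEMALE", "MALE"],
})

# A flat dictionary passed to DataFrame.replace is applied to all columns
relabelled = survey.replace({"MALE": "BOY", "FEMALE": "GIRL"})

print(relabelled)
```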
In all the previous examples, we specified the column that the values to replace came from. We're now going to replace several values from the same column with one common value.
We want to classify all ethnicities into three big categories: Black, Asian and White.
The syntax is again very simple. We use nested dictionaries here: the outer key is the column in which we want to replace values. The value of this outer key is another dictionary, whose keys are the ethnicities to replace and whose values are the new ethnicity (Black, Asian or White).
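A minimal sketch of the nested-dictionary syntax, assuming a stand-in DataFrame (the `Ethnicity` column name and the original labels below are illustrative assumptions, not necessarily those of the course dataset):

```python
import pandas as pd

# Hypothetical stand-in data; the labels are assumptions for illustration
names = pd.DataFrame({
    "Ethnicity": [
        "BLACK NON HISPANIC",
        "BLACK NON HISP",
        "ASIAN AND PACIFIC ISLANDER",
        "WHITE NON HISPANIC",
        "WHITE NON HISP",
    ],
    "Gender": ["MALE", "FEMALE", "FEMALE", "MALE", "FEMALE"],
})

# Outer key: the column to work on.
# Inner dictionary: each old label mapped to its new common category.
names = names.replace({
    "Ethnicity": {
        "BLACK NON HISPANIC": "BLACK",
        "BLACK NON HISP": "BLACK",
        "ASIAN AND PACIFIC ISLANDER": "ASIAN",
        "WHITE NON HISPANIC": "WHITE",
        "WHITE NON HISP": "WHITE",
    }
})

print(names)
```

Because the outer key names the column, the other columns (here `Gender`) are not touched.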
4. Let's do it!
Now that you know almost everything about the .replace() method, let's practice!