1. Basic visualizations
Data visualization is a powerful tool in the hands of data professionals. But you need to master it to unlock its full power.
Let's explore common charts and plots for different use cases and draw out some key principles.
2. Parts of a whole
The pie chart you have used so far is a good fit for displaying the parts of a whole. In this case, it shows each category's share of the total.
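To make "share of the total" concrete, here is a minimal sketch of the percentages a pie chart displays. The department names and counts are made-up illustration values, the function names are hypothetical, and the plotting helper assumes matplotlib is installed.

```python
# Hypothetical counts per category (illustrative values only).
counts = {"Sales": 50, "Engineering": 30, "HR": 20}


def shares(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


def plot_pie(counts):
    """Draw the pie chart; assumes matplotlib is installed."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # autopct prints each slice's percentage on the chart.
    ax.pie(list(counts.values()), labels=list(counts.keys()), autopct="%1.0f%%")
    ax.set_title("Share of employees by department")
    plt.show()


category_shares = shares(counts)
```

Calling `plot_pie(counts)` would draw one slice per department, sized by the values in `category_shares`.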
3. Explore distributions
If you're plotting the distribution of a numeric variable, you can use a histogram. It shows how the values are distributed across different intervals, called bins. In this example, you can see that most people in the dataset are around 35 years old.
4. Explore distributions
You can change the number of bins to get a more granular representation, giving a more precise idea of where the ages are concentrated.
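Here is a rough sketch of what changing the number of bins does, assuming equal-width bins. The ages are made-up illustration values and `bin_counts` is a hypothetical helper; a plotting library's histogram function normally does this counting for you.

```python
# Hypothetical ages (illustrative values only).
ages = [22, 25, 29, 31, 33, 34, 35, 35, 36, 37, 38, 41, 45, 52, 60]


def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals.

    Assumes the values are not all identical (the bin width would be zero).
    """
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value belongs to the last bin, not to an extra one.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


coarse = bin_counts(ages, 2)  # two wide bins
fine = bin_counts(ages, 4)    # four narrower bins
```

With two bins you only learn that most ages sit in the lower half of the range. With four bins the peak narrows to the second interval, which is the extra granularity the narration refers to.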
5. Explore relations
Bar charts are among the simplest and most widely used charts for plotting categorical data. The length of each bar represents the numerical value associated with the category, for example, its number of occurrences or, as in this case, the sum of the expenses for each department.
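As a small sketch of the idea that bar length encodes a value, the snippet below sums expenses per department and renders a text-only bar chart. The records, the `unit` scale, and both function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical expense records as (department, amount) pairs.
expenses = [("HR", 120), ("IT", 300), ("HR", 80), ("Sales", 250), ("IT", 150)]


def total_by_category(records):
    """Sum the amounts for each category."""
    totals = defaultdict(int)
    for category, amount in records:
        totals[category] += amount
    return dict(totals)


def text_bar_chart(totals, unit=50):
    """Return one line per category; each '#' stands for `unit` of value."""
    lines = []
    for category, value in sorted(totals.items(), key=lambda item: -item[1]):
        lines.append(f"{category:<6}{'#' * (value // unit)} {value}")
    return lines


totals = total_by_category(expenses)
chart = text_bar_chart(totals)
```

Printing the lines in `chart` lists the departments from largest to smallest total, with a bar proportional to each total.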
6. Explore relations
If, instead, you want to find a relationship between two numeric variables, you can use a scatter plot. Each dot represents a data point, and you can see at a glance whether there is a correlation between the two variables.
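What you judge by eye in a scatter plot can also be summarized numerically. The sketch below computes the Pearson correlation coefficient by hand; the variable names and data are made up for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation: +1 is a perfect upward trend, -1 a perfect downward one."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data (illustrative values only).
years_experience = [1, 2, 3, 4, 5]
salary_k = [30, 35, 42, 48, 55]

r = pearson(years_experience, salary_k)
```

A value of `r` close to 1 corresponds to dots that line up along a rising diagonal. A value near 0 would correspond to a cloud with no visible trend.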
7. Explore relations
You can color the dots to introduce a third variable
8. Explore relations
and plot the combination of multiple variables in a scatter plot matrix.
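Both ideas can be sketched without drawing anything: choosing a color per point from a third, categorical variable, and listing which variable pairs a scatter plot matrix would show. The records, the palette, and the function names are assumptions for illustration; the color strings are matplotlib's named colors.

```python
from itertools import permutations

# Hypothetical records (illustrative values only).
points = [
    {"age": 25, "salary": 30, "tenure": 1, "department": "HR"},
    {"age": 35, "salary": 48, "tenure": 6, "department": "IT"},
    {"age": 45, "salary": 60, "tenure": 12, "department": "IT"},
]

# This short palette supports at most three distinct categories.
palette = ["tab:blue", "tab:orange", "tab:green"]


def color_by(points, key):
    """Assign one color per distinct category and return a color for each point."""
    categories = sorted({p[key] for p in points})
    lookup = dict(zip(categories, palette))
    return [lookup[p[key]] for p in points]


def matrix_panels(variables):
    """List the (x, y) pairs shown off the diagonal of a scatter plot matrix."""
    return list(permutations(variables, 2))


colors = color_by(points, "department")
panels = matrix_panels(["age", "salary", "tenure"])
```

The list in `colors` could be passed as the color argument of a scatter plot, and `panels` shows that three variables already produce six off-diagonal panels.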
9. Explore relations
Another useful plot for exploring relations is the parallel coordinates plot. It draws each data point as a line that crosses a set of parallel vertical axes, one per variable, at the height of its value for that variable. Here, you can also easily see if there is a positive or negative correlation between two variables. However, only variables on adjacent axes are easily comparable.
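The following sketch shows the data preparation behind such a plot under one common assumption, namely that every variable is rescaled to the range 0 to 1 so the axes share a height. The variables and values are made up, and `min_max_scale` is a hypothetical helper.

```python
def min_max_scale(column):
    """Rescale values to the 0..1 range (assumes they are not all identical)."""
    low, high = min(column), max(column)
    return [(v - low) / (high - low) for v in column]


# Hypothetical data: one list per variable, in left-to-right axis order.
axes = {
    "age": [25, 35, 45],
    "salary": [30, 48, 60],
    "absences": [9, 5, 1],
}

scaled = {name: min_max_scale(values) for name, values in axes.items()}

# Each record becomes one polyline: its height on every axis, left to right.
polylines = [list(heights) for heights in zip(*scaled.values())]

# Line segments only join neighbouring axes.
names = list(axes)
adjacent_pairs = list(zip(names, names[1:]))
```

In this toy data the lines between `age` and `salary` never cross, which hints at a positive correlation, while the lines between `salary` and `absences` all cross, which hints at a negative one. There is no segment joining `age` and `absences` directly, which is why reordering the axes changes what you can compare.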
10. Keep it simple
After choosing the appropriate chart, it is good to remember some key data visualization principles.
An effective visualization communicates the information clearly and simply, avoiding unnecessary complexity. The key consideration here is the reader's cognitive load: the more complex the visualization becomes, the more the reader has to process.
11. Titles and labels
Make sure to label the axes and add an informative title to give context.
12. Axis scale
Be accurate and consistent across all visualizations to avoid misleading the reader. These two bar charts show the same data, but since the axis of the right one does not start at zero, it looks like the HR department had almost no expenses at all.
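A quick calculation shows why a truncated axis misleads. The expense figures below are made up for illustration, `drawn_height_ratio` is a hypothetical helper, and the plotting function assumes matplotlib is installed.

```python
# Hypothetical expenses per department (illustrative values only).
expenses = {"HR": 105, "IT": 150}


def drawn_height_ratio(small, large, axis_start):
    """Ratio of the two bars' drawn heights when the axis starts at axis_start."""
    return (small - axis_start) / (large - axis_start)


honest = drawn_height_ratio(expenses["HR"], expenses["IT"], axis_start=0)
truncated = drawn_height_ratio(expenses["HR"], expenses["IT"], axis_start=100)


def plot_bars(expenses):
    """Draw the bars with a zero baseline; assumes matplotlib is installed."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(list(expenses.keys()), list(expenses.values()))
    ax.set_ylim(bottom=0)  # keep bar heights proportional to the values
    ax.set_xlabel("Department")
    ax.set_ylabel("Expenses")
    ax.set_title("Expenses by department")
    plt.show()
```

With the axis starting at zero, the HR bar is drawn at 70% of the IT bar's height, which matches the data. With the axis starting at 100, it is drawn at only 10%, even though the underlying numbers are unchanged.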
13. Colors
Lastly, use colors wisely. They can be used strategically to highlight or group similar elements, but they can become distracting if overused.
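One simple way to use color strategically is to mute everything except the element you want the reader to notice. This is a minimal sketch; the function name and default colors are assumptions, and the color strings are matplotlib's named colors.

```python
def highlight_colors(categories, focus, accent="tab:red", muted="lightgray"):
    """Return one color per category, using the accent only for the focus category."""
    return [accent if category == focus else muted for category in categories]


# Hypothetical department names (illustrative values only).
bar_colors = highlight_colors(["HR", "IT", "Sales"], focus="IT")
```

The resulting list could be passed as the color argument of a bar chart, so that a single bar stands out and the rest recede.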
14. Let's practice!
Now that you have seen some graphs and best practices, let's recap what you have learned.