In this chapter, we focus on visualizing proportions of a whole. We see that pie charts really aren't so bad, and we discuss waffle charts and stacked bars for comparing multiple proportions.

Grammar of Graphics intro

Familiarizing with disease data

Warming up data-wrangling

The pie chart and its friends

The infamous P-I-E

Cleaning up the pie

How about a waffle?

When to use bars

Basic stacked bars

Ordering stack for readability

Categorical x-axis

Proportions of a whole

We now shift our focus to single-observation, or point, data. We go over when bar charts are appropriate and when they are not, what to use instead, and general perception-based enhancements for your charts.

Bars and dots: point data

Are bars appropriate?

Working with geom_col

Wrangling geom_bar

Point charts

Ordered point chart

Adding visual anchors

Faceting to show structure

Tuning your charts

Let's flip some axes

Cleaning up the bars

Converting to point chart

Point data

We now move on to visualizing distributional data. We expose the fragility of histograms, discuss when it is better to switch to a kernel density plot, and show how to make both plots work best for your data.

Importance of distributions

Orienting with the data

Looking at all data

Changing y-axis to density

Histogram nuances

Adjusting the bin numbers

More bars

Bin width by context

The kernel density estimator

Histogram to KDE

Putting a rug down

KDE with lots of data

Single distributions

To finish off, we take a look at comparing multiple distributions to each other. We see why traditional box plots are very dangerous and how to easily improve them, and we investigate when you should use more advanced alternatives like beeswarm and violin plots.

Intro to comparing distributions

A simple boxplot

Adding some jitter

Faceting to show all colors

Beeswarms and violins

Your first beeswarm

Fiddling with a violin plot

Violins with boxplots

Comparing lots of distributions

Comparing spatially-related distributions

A basic ridgeline plot

Cleaning up your ridgelines

Making it rain (data points)

Wrap-up

Comparing distributions

World Health Organization Disease Dataset

This course will help you take your data visualization skills beyond the basics and hone them into a powerful part of your data science toolkit. Over the lessons, we will use two interesting open datasets to cover different types of data (proportions, point data, single distributions, and multiple distributions) and discuss the pros and cons of the most common visualizations. In addition, we will cover some less common alternative visualizations for these data types and how to tweak default ggplot settings to get your message across most efficiently and effectively.

Introduction to Data Visualization with ggplot2

Learn to effectively convey your data with an overview of common charts, alternative visualization types, and perception-driven style enhancements.

Visualization Best Practices in R

Data Visualization in R

Chapter 1: Proportions of a whole

Chapter 2: Point data

Chapter 3: Single distributions

Chapter 4: Comparing distributions
