Boxplots

Explore how boxplots visually summarize numerical data distributions using the five-number summary and interquartile range. Learn to create and interpret side-by-side boxplots in R with ggplot2 to compare data across categories and identify outliers.

Faceted histograms are one way to compare the distribution of a numerical variable across the values of another variable. A side-by-side boxplot achieves the same goal. A boxplot is constructed from the five-number summary of a numerical variable.

Five-number summary

The five-number summary consists of five summary statistics: the minimum, the first quartile (25th percentile), the second quartile (median or 50th percentile), the third quartile (75th percentile), and the maximum.
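As a minimal sketch, the five-number summary can be computed in base R. The vector `x` below is hypothetical sample data chosen only for illustration; it does not come from the lesson. Note that `quantile()` (with its default interpolation method) and `fivenum()` (which uses Tukey's hinges) can return slightly different quartiles on other data, although they agree for this vector.

```r
# Hypothetical sample data (an assumption made for illustration only)
x <- c(1, 3, 4, 6, 7, 8, 10, 12, 15)

# Minimum, first quartile, median, third quartile, and maximum,
# using quantile() with its default method (type = 7)
five_num <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
print(five_num)

# Tukey's five-number summary, which uses hinges for the quartiles
tukey <- fivenum(x)
print(tukey)

# Both give 1, 4, 7, 10, 15 for this particular vector
```

Base R's `summary(x)` reports the same five statistics along with the mean, which may be convenient for a quick look at a single variable.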

The quartiles are calculated as:

  • The first quartile ($Q_1$): The median of the first half of the sorted data

  • The third quartile ($Q_3$ ...