Visualization with Box Plots
Explore how to create and customize box plots in Seaborn to visualize the distribution of numerical data across categories. Learn to interpret components such as medians, quartiles, whiskers, and outliers. Understand how to use hue, adjust plot aesthetics, and compare distributions across multiple categories using real data from the Titanic dataset.
Overview
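A box plot (also called a box-and-whisker plot) shows the distribution of numerical data and lets us compare that distribution across different categories. It has three main parts: a box that spans the interquartile range (IQR), from the first quartile to the third quartile, with a line marking the median; whiskers that extend from the box to the most extreme data points that are not outliers; and the outliers themselves, drawn as individual points. Outliers are data points that lie far from the rest of the data. By default, Seaborn treats any point more than 1.5 × IQR beyond the quartiles as an outlier.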
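To make these components concrete, here is a minimal sketch that computes the median, the quartiles, the whisker ends, and the outliers by hand. The sample ages are made up for illustration and are not taken from the Titanic dataset. The 1.5 × IQR rule used here is the common convention and matches the default `whis` setting of `sns.boxplot()`.

```python
import numpy as np

# Hypothetical sample of ages, invented for illustration only
ages = np.array([2, 18, 22, 25, 28, 30, 35, 40, 80])

# First quartile, median, and third quartile
q1, median, q3 = np.percentile(ages, [25, 50, 75])

# The box spans the interquartile range (IQR)
iqr = q3 - q1

# Points beyond these fences are treated as outliers (1.5 x IQR rule)
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme points that are still inside the fences
inside = ages[(ages >= lower_fence) & (ages <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

# Everything outside the fences is drawn as an individual outlier point
outliers = ages[(ages < lower_fence) | (ages > upper_fence)].tolist()

print("median:", median, "Q1:", q1, "Q3:", q3, "IQR:", iqr)
print("whiskers:", whisker_low, "to", whisker_high)
print("outliers:", outliers)
```

Note that the whiskers stop at actual data points rather than at the fences themselves, which is why they are often shorter than 1.5 × IQR.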
Plotting box plots
We’ll use the Titanic dataset in this lesson. It’s loaded in the DataFrame named titanic_df. Let’s check the distribution of the age column by drawing a box plot using the sns.boxplot() function.
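A minimal sketch of that call is shown here. In the lesson, titanic_df is already loaded. So that the snippet runs on its own, it builds a small stand-in DataFrame with made-up ages, which is an assumption for illustration and not the real data.

```python
import pandas as pd
import seaborn as sns

# Stand-in for the lesson's titanic_df (assumption: made-up ages, not real data).
# With internet access, sns.load_dataset("titanic") fetches Seaborn's copy
# of the Titanic dataset instead.
titanic_df = pd.DataFrame({"age": [2, 18, 22, 25, 28, 30, 35, 40, 80]})

# Draw a horizontal box plot of the age column; the call returns a Matplotlib Axes
ax = sns.boxplot(data=titanic_df, x="age")

# In a script, call matplotlib.pyplot.show() to display the figure;
# notebooks usually render it automatically.
```

Passing the column as `y="age"` instead of `x="age"` would draw the same box plot vertically.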
The box plot has several different components, as illustrated in the figure below:
Seaborn first computes the median of our data to plot a box plot. Once the median is located, a line is drawn ...