Continuous vs Categorical Bivariate Analysis: Boxplot & Histogram

Explore bivariate analysis techniques that compare continuous and categorical variables using boxplots and histograms. Learn to create grouped and colored visualizations with Plotly Express and Graph Objects, enhancing data interpretation skills with real-world examples like customer churn data.

Bivariate analysis: Bivariate histograms

In this section, we will discuss how to analyze a continuous variable and a categorical variable together.

Bivariate analysis

Bivariate analysis is a statistical technique for exploring the relationship between two variables. When one variable is continuous and the other is categorical, we can use it to determine whether the distribution of the continuous variable differs across the categories of the categorical variable.
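
Before plotting anything, the idea can be illustrated numerically. The following is a minimal sketch, not part of the lesson's own code: it builds a small made-up DataFrame (the column names "Exited" and "Age" and all values are assumptions for illustration only) and summarizes the continuous variable separately for each category.

```python
import pandas as pd

# Hypothetical toy data; the column names "Exited" and "Age" are assumptions
# for illustration and are not taken from the lesson's churn dataset.
toy = pd.DataFrame({
    "Exited": ["No", "No", "No", "Yes", "Yes", "Yes"],
    "Age": [30, 35, 40, 45, 50, 55],
})

# Summarize the continuous variable (Age) within each category (Exited)
summary = toy.groupby("Exited")["Age"].agg(["count", "median", "min", "max"])
print(summary)
```

If the per-category summaries differ noticeably, that suggests the distribution of the continuous variable depends on the category, which is what grouped box plots and histograms show visually.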

For the next few examples, we will use a customer churn dataset that records whether individuals have left a company. The dataset includes important features of each customer, such as their age, credit score, and place of residence.

Python 3.8
# Import libraries
import pandas as pd
# Import dataset
churn = pd.read_csv("/usr/local/csvfiles/churn.csv")
# Look at data
print(churn.head())

Bivariate box plots

We will expand our work from the first chapter and, this time, group our plots by category. We will start by ...