How can we visually compare two plots using Matplotlib?

In the realm of data analysis and visualization, the ability to compare and contrast different datasets is crucial for gaining insights and making informed decisions. In this Answer, we will explore how we can visually compare two plots using matplotlib.

Matplotlib is a popular Python library for creating static, interactive, and animated plots. It provides powerful tools for visualizing data, and we'll go over a few of them.

Need for visual comparison

Comparing various plots lies at the heart of data analysis. It allows us to discern patterns, trends, and anomalies within datasets, enabling informed decision-making. Whether it's tracking the performance of different products, assessing the impact of variables over time, or understanding the correlation between datasets, the ability to compare visually is crucial for extracting actionable insights.

Common ways to compare plots with matplotlib

Let's explore several effective methods within matplotlib that empower data analysts to visually compare datasets, each offering its own advantages depending on the nature of the data and the goals of the analysis.

Subplots: We can create distinct subplots for each dataset side by side using plt.subplots().
Shared x-axis: We can create subplots that share the x-axis but have their own y-axes using plt.subplots() with sharex=True.
Dual-axis approach: We can overlay two plots on the same set of axes but with different y-axes using plt.twinx(). This can be particularly useful when the datasets have different scales but share a common x-axis.
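
As a quick orientation before the full examples, the three approaches can be sketched by the Axes objects each call hands back. This is a minimal sketch, not a complete plot: the Agg backend is an assumption made so it runs without a display, and the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

# 1. Subplots: one row, two columns, each Axes with independent scales
fig1, axes = plt.subplots(1, 2)

# 2. Shared x-axis: two stacked Axes whose x-limits stay in sync
fig2, (top, bottom) = plt.subplots(2, 1, sharex=True)

# 3. Dual axis: a second y-axis layered on an existing Axes
fig3, left = plt.subplots()
right = left.twinx()

n_side_by_side = len(axes)
shares_x = top.get_shared_x_axes().joined(top, bottom)
same_figure = right.figure is left.figure
```

Here axes holds two separate Axes, top and bottom are linked on x, and right lives in the same figure as left while keeping its own y-scale.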

Examples

To better understand each of the techniques above, let's look at a few examples.

Subplots

In this example, we demonstrate how to create a simple side-by-side comparison of monthly sales for two products using plt.subplots. The data is randomly generated, and the resulting plot shows the trend in sales for each product over the 12 months of the year.

import matplotlib.pyplot as plt
import numpy as np
# Sample data: random monthly sales for two products
months = np.arange(1, 13)
sales_product_a = np.random.randint(50, 200, size=12)
sales_product_b = np.random.randint(50, 200, size=12)
# One row, two columns of subplots
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
# Product A on the left subplot
axes[0].plot(months, sales_product_a, color='blue', marker='o', label='Product A')
axes[0].set_title('Monthly Sales - Product A')
# Product B on the right subplot
axes[1].plot(months, sales_product_b, color='green', marker='s', label='Product B')
axes[1].set_title('Monthly Sales - Product B')
# Label both subplots
for ax in axes:
    ax.set_xlabel('Month')
    ax.set_ylabel('Sales')
    ax.legend()

# Adjust layout for better appearance
plt.tight_layout()
# Save before showing; saving after plt.show() can produce a blank image (output/ must exist)
plt.savefig("output/graph.png")
plt.show()

Code explanation

Lines 4–6: Create an array months representing the 12 months of a year and then generate random sales data for two products (A and B) using NumPy's randint function.
Line 8: Create a figure fig and a set of subplots axes arranged in one row and two columns. Set the figure size to 12x4 inches.
Lines 10–14: Plot the sales data for Product A on the left subplot and Product B on the right subplot. Different markers and colors are used for clarity. Titles are set for each subplot.
Lines 16–19: Axis labels are set for both x and y axes, and legends are added to identify the products.
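
One caveat with side-by-side subplots is that each one picks its own y-axis range by default, so equal-looking heights can stand for different values. A minimal sketch of one way to address this, assuming we want both products on the same scale, is to pass sharey=True to plt.subplots(). The seeded generator and the Agg backend are assumptions made here for reproducibility and are not part of the example above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for repeatable data
months = np.arange(1, 13)
sales_a = rng.integers(50, 200, size=12)
sales_b = rng.integers(50, 200, size=12)

# sharey=True makes both subplots use one common y-axis range
fig, axes = plt.subplots(1, 2, sharey=True, figsize=(12, 4))
axes[0].plot(months, sales_a, color='blue', marker='o')
axes[0].set_title('Monthly Sales - Product A')
axes[1].plot(months, sales_b, color='green', marker='s')
axes[1].set_title('Monthly Sales - Product B')

axes[0].set_ylabel('Sales')  # one label is enough when the axis is shared
for ax in axes:
    ax.set_xlabel('Month')
fig.tight_layout()

same_scale = axes[0].get_ylim() == axes[1].get_ylim()
```

With a shared y-axis, a point at the same height in either subplot represents the same sales figure, which makes the two trends directly comparable.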

Shared x-axis

In this example, we create a vertically stacked two-subplot visualization comparing monthly temperature and precipitation data. The x-axis is shared between the subplots for better comparison.

import matplotlib.pyplot as plt
import numpy as np
# Sample data
months = np.arange(1, 13)
temperature_data = np.random.uniform(10, 30, size=12)
precipitation_data = np.random.uniform(0, 100, size=12)
# Create subplots with a shared x-axis
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(8, 6))
# Plot temperature data on the first subplot
ax1.plot(months, temperature_data, color='orange', marker='o', label='Temperature')
ax1.set_ylabel('Temperature (°C)')
ax1.set_title('Monthly Temperature and Precipitation')
# Plot precipitation data on the second subplot
ax2.plot(months, precipitation_data, color='blue', marker='s', label='Precipitation')
ax2.set_xlabel('Month')
ax2.set_ylabel('Precipitation (mm)')
# Display legends
ax1.legend()
ax2.legend()
# Adjust layout for better appearance
plt.tight_layout()
# Save before showing; saving after plt.show() can produce a blank image (output/ must exist)
plt.savefig("output/graph.png")
# Show the plot
plt.show()
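
To see what sharing the x-axis does in practice, the following minimal sketch changes the x-limits on one subplot and reads them back from the other. The seeded data and the Agg backend are assumptions made so the sketch runs the same way everywhere.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # assumption: fixed seed for repeatable data
months = np.arange(1, 13)
temperature = rng.uniform(10, 30, size=12)
precipitation = rng.uniform(0, 100, size=12)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(months, temperature)
ax2.plot(months, precipitation)

# Zoom to the first half of the year on the bottom subplot only
ax2.set_xlim(1, 6)

# The top subplot follows because the x-axis is shared
top_limits = ax1.get_xlim()
bottom_limits = ax2.get_xlim()
```

Because both subplots always cover the same months, a peak in one lines up vertically with the same month in the other.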

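Dual-axis approach

In this example, we overlay two datasets on one plot using twinx(). Both datasets share the x-axis, and each gets its own y-axis, which helps when their scales differ.

import matplotlib.pyplot as plt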
import numpy as np
# Generate sample data
x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = 2 * np.cos(x)
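# Plot the first dataset on the primary (left) y-axis
fig, ax1 = plt.subplots()
ax1.plot(x, y1, color='blue', label='Dataset 1')
ax1.set_xlabel('X-axis')
ax1.set_ylabel('Dataset 1', color='blue')
# Create a twin Axes sharing the x-axis (plt.twinx() does the same for the current Axes)
ax2 = ax1.twinx()
ax2.plot(x, y2, color='green', label='Dataset 2')
ax2.set_ylabel('Dataset 2', color='green')
# Display legends; call legend() on each Axes explicitly, because after
# twinx() the pyplot-level plt.legend() targets the twin Axes
ax1.legend(loc='upper left')
ax2.legend(loc='upper right')
ax1.set_title('Visual Comparison of Two Datasets')
# Save before showing; saving after plt.show() can produce a blank image (output/ must exist)
plt.savefig("output/graph.png")
# Display the plot
plt.show()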
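
The example above requests a separate legend for each dataset. A single combined legend can be easier to read on a dual-axis plot. The following is a minimal sketch of one way to build it, by collecting the handles and labels from both Axes with get_legend_handles_labels(). The Agg backend is an assumption made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = 2 * np.cos(x)

fig, ax1 = plt.subplots()
ax1.plot(x, y1, color='blue', label='Dataset 1')
ax2 = ax1.twinx()
ax2.plot(x, y2, color='green', label='Dataset 2')

# Gather legend entries from both Axes and draw them as one legend
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
legend = ax2.legend(handles1 + handles2, labels1 + labels2, loc='upper left')

legend_labels = [text.get_text() for text in legend.get_texts()]
```

The legend is attached to ax2 here because the twin Axes is drawn on top, so its legend is not covered by the second dataset's line.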