How can we visually compare two plots using Matplotlib?
In the realm of data analysis and visualization, the ability to compare and contrast different datasets is crucial for gaining insights and making informed decisions. In this Answer, we will explore how we can visually compare two plots using matplotlib.
matplotlib is a popular Python library for creating static, interactive, and animated plots. It provides powerful tools for visualizing data, and we'll go over a few of them.
Need for visual comparison
Comparing various plots lies at the heart of data analysis. It allows us to discern patterns, trends, and anomalies within datasets, enabling informed decision-making. Whether it's tracking the performance of different products, assessing the impact of variables over time, or understanding the correlation between datasets, the ability to compare visually is crucial for extracting actionable insights.
Common ways to compare plots with matplotlib
Let's explore several effective methods within matplotlib that empower data analysts to visually compare datasets, each offering its own advantages depending on the nature of the data and the goals of the analysis.
Subplots: We can create a separate subplot for each dataset, placed side by side, using plt.subplots().
Shared x-axis: We can create subplots that share the x-axis but each have their own y-axis using plt.subplots() with sharex=True.
Dual-axis approach: We can overlay two plots in the same plotting area, with a common x-axis and separate left and right y-axes, using plt.twinx(). This is particularly useful when the datasets have different scales but share a common x-axis.
Examples
To better understand each of the techniques above, let's look at an example of each.
Subplots
In this example, we demonstrate how to create a simple side-by-side comparison of monthly sales for two products using plt.subplots. The data is randomly generated, and the resulting plot shows the trend in sales for each product over the 12 months of the year.
    import matplotlib.pyplot as plt
    import numpy as np

    months = np.arange(1, 13)
    sales_product_a = np.random.randint(50, 200, size=12)
    sales_product_b = np.random.randint(50, 200, size=12)

    fig, axes = plt.subplots(1, 2, figsize=(12, 4))

    axes[0].plot(months, sales_product_a, color='blue', marker='o', label='Product A')
    axes[0].set_title('Monthly Sales - Product A')

    axes[1].plot(months, sales_product_b, color='green', marker='s', label='Product B')
    axes[1].set_title('Monthly Sales - Product B')

    for ax in axes:
        ax.set_xlabel('Month')
        ax.set_ylabel('Sales')
        ax.legend()

    # Adjust layout for better appearance
    plt.tight_layout()

    # Save before showing (the output/ directory must already exist)
    plt.savefig("output/graph.png")
    plt.show()
Code explanation
Lines 4–6: Create an array months representing the 12 months of a year and then generate random sales data for two products (A and B) using NumPy's randint function.
Line 8: Create a figure fig and a set of subplots axes arranged in one row and two columns. Set the figure size to 12 by 4 inches.
Lines 10–14: Plot the sales data for Product A on the left subplot and Product B on the right subplot. Different markers and colors are used for clarity. Titles are set for each subplot.
Lines 16–19: Axis labels are set for both x and y axes, and legends are added to identify the products.
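Side-by-side subplots get independent y-axis ranges by default, so the same line height can stand for different sales figures in each panel. When both datasets use the same unit, passing sharey=True to plt.subplots() puts them on one scale. The following is a minimal sketch under that assumption; the seeded random generator and the variable names are illustrative choices, not part of the example above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data: a seeded generator makes the figure reproducible.
rng = np.random.default_rng(seed=0)
months = np.arange(1, 13)
sales_product_a = rng.integers(50, 200, size=12)
sales_product_b = rng.integers(50, 200, size=12)

# sharey=True links the y-axis limits of both panels.
fig, axes = plt.subplots(1, 2, sharey=True, figsize=(12, 4))

axes[0].plot(months, sales_product_a, color='blue', marker='o')
axes[0].set_title('Monthly Sales - Product A')
axes[0].set_ylabel('Sales')  # y tick labels are hidden on the right panel

axes[1].plot(months, sales_product_b, color='green', marker='s')
axes[1].set_title('Monthly Sales - Product B')

for ax in axes:
    ax.set_xlabel('Month')

fig.tight_layout()
# Call plt.show() or fig.savefig(...) to view the result.
```

With a shared y-axis, a taller line in one panel really does mean higher sales than in the other, at the cost of some vertical resolution if one product sells far less than the other.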
Shared x-axis
In this example, we create a vertically stacked two-subplot visualization comparing monthly temperature and precipitation data. The x-axis is shared between the subplots for better comparison.
    import matplotlib.pyplot as plt
    import numpy as np

    # Sample data
    months = np.arange(1, 13)
    temperature_data = np.random.uniform(10, 30, size=12)
    precipitation_data = np.random.uniform(0, 100, size=12)

    # Create subplots with a shared x-axis
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(8, 6))

    # Plot temperature data on the first subplot
    ax1.plot(months, temperature_data, color='orange', marker='o', label='Temperature')
    ax1.set_ylabel('Temperature (°C)')
    ax1.set_title('Monthly Temperature and Precipitation')

    # Plot precipitation data on the second subplot
    ax2.plot(months, precipitation_data, color='blue', marker='s', label='Precipitation')
    ax2.set_xlabel('Month')
    ax2.set_ylabel('Precipitation (mm)')

    # Display legends
    ax1.legend()
    ax2.legend()

    # Adjust layout for better appearance
    plt.tight_layout()

    # Save before showing (the output/ directory must already exist)
    plt.savefig("output/graph.png")
    plt.show()
Code explanation
This code widget is similar to the previous one, so let's examine what has changed.
Line 10: Create a figure with two vertically stacked subplots. The sharex=True parameter ensures that both subplots share the same x-axis.
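One practical consequence of sharex=True is that the x-limits are linked: changing them on one subplot changes them on the other, so zooming into a range of months keeps both panels aligned. Here is a minimal sketch of that behavior; the data values and the choice of months 6 to 9 are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
# Illustrative values only, not real measurements.
temperature = 20 + 8 * np.sin((months - 4) * np.pi / 6)
precipitation = 50 + 30 * np.cos((months - 1) * np.pi / 6)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True, figsize=(8, 6))
ax1.plot(months, temperature, color='orange', marker='o')
ax1.set_ylabel('Temperature (°C)')
ax2.plot(months, precipitation, color='blue', marker='s')
ax2.set_ylabel('Precipitation (mm)')
ax2.set_xlabel('Month')

# Restrict the view on the top subplot only...
ax1.set_xlim(6, 9)
# ...and the bottom subplot follows because the x-axis is shared.
linked_xlim = ax2.get_xlim()
```

The y-axes stay independent, so each subplot still autoscales to its own unit.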
Dual-axis approach
In this example, we plot the first dataset, y1, on the left y-axis, and the second dataset, y2, on the right y-axis.
    import matplotlib.pyplot as plt
    import numpy as np

    # Generate sample data
    x = np.linspace(0, 10, 100)
    y1 = np.sin(x)
    y2 = 2 * np.cos(x)

    # Plot the first dataset
    plt.plot(x, y1, color='blue', label='Dataset 1')
    plt.xlabel('X-axis')
    plt.ylabel('Dataset 1', color='blue')
    ax1 = plt.gca()  # Keep a handle to the left-hand Axes for its legend
    # Create a twin Axes sharing the x-axis
    ax2 = plt.twinx()
    ax2.plot(x, y2, color='green', label='Dataset 2')
    ax2.set_ylabel('Dataset 2', color='green')

    # Display legends (plt.legend() would act on ax2, the current Axes)
    ax1.legend(loc='upper left')
    ax2.legend(loc='upper right')

    # Save, then display the plot (the output/ directory must already exist)
    plt.title('Visual Comparison of Two Datasets')
    plt.savefig("output/graph.png")
    plt.show()
Code explanation
Lines 10–12: Plot the first dataset, y1, with a blue color. Set labels for the x-axis and y-axis for the first dataset.
Line 15: Create a twin axes, ax2, that shares the same x-axis with the original plot using plt.twinx(). This allows two different y-axes to be plotted on the same x-axis.
Lines 16–17: Plot the second dataset, y2, on the twin axes with a green color. Set the y-axis label for the second dataset.
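With two y-axes, each Axes keeps its own legend, and pyplot calls made after plt.twinx() act on the twin because it becomes the current Axes. An alternative sketch, assuming you prefer explicit Axes objects, calls ax1.twinx() and merges the legend entries from both Axes into a single box. The names ax1, handles1, labels1, and so on are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
y1 = np.sin(x)
y2 = 2 * np.cos(x)

fig, ax1 = plt.subplots()
ax1.plot(x, y1, color='blue', label='Dataset 1')
ax1.set_xlabel('X-axis')
ax1.set_ylabel('Dataset 1', color='blue')

# Twin Axes: same x-axis, independent y-axis on the right.
ax2 = ax1.twinx()
ax2.plot(x, y2, color='green', label='Dataset 2')
ax2.set_ylabel('Dataset 2', color='green')

# Collect legend entries from both Axes and draw one combined legend.
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()
legend = ax2.legend(handles1 + handles2, labels1 + labels2, loc='upper left')

ax1.set_title('Visual Comparison of Two Datasets')
# Call plt.show() or fig.savefig(...) to view the result.
```

Drawing the combined legend on ax2 keeps it above both lines, since the twin Axes is rendered on top of the original.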
matplotlib provides a rich set of tools for visually comparing plots, empowering data analysts to unearth insights and make informed decisions. Whether through subplots, dual-axis representation, or shared axes, the library offers flexibility to tailor visualizations to the unique characteristics of diverse datasets.