How to create a bubble plot with Plotly Graph Objects in Python

Plotly Graph Objects (the plotly.graph_objects module of the Plotly Python library) provides a flexible and powerful way to create interactive data visualizations. A bubble plot extends the scatter plot by encoding a third variable in the size of the markers, or bubbles. This makes it useful for visualizing three numerical variables in a two-dimensional space: the x and y coordinates position each data point, and the size of its marker corresponds to the value of the third variable.

Features of the bubble plot in Plotly Graph Objects

The following are some key features of bubble plots using Plotly Graph Objects:

  • The x and y coordinates: Like traditional scatter plots, bubble plots have x and y coordinates that determine the position of each point on the plot.

  • Marker size (bubble size): The size of each bubble or marker on the plot represents the value of a third variable, often referred to as the size variable. Depending on the marker's sizemode setting ('area' or 'diameter'), this value is mapped to either the area or the diameter of the bubble.

  • Marker color: We can also assign colors to the markers based on a fourth variable. This allows us to visualize an additional dimension of data by mapping colors to data values.

  • Hover text: When we hover over a bubble, a tooltip or hover text can display additional information related to the data point. This is useful for providing context or details about each data point.

  • Legend and color bar: A legend identifies the traces on the plot, and a color bar shows how marker colors map to data values. Together, they help interpret the values associated with the markers.

  • Data scaling: Bubble plots often require appropriately scaling the size variable to make the bubble size differences visually meaningful. Plotly allows us to adjust the scaling factor through the marker's sizeref attribute, which controls how the size variable is mapped to marker size.

  • Customization: Plotly provides extensive customization options, allowing us to change marker shapes, adjust axis labels and titles, set plot titles, change background colors, and more.

  • Interactivity: Plotly figures are interactive by default. We can zoom, pan, and toggle the visibility of individual traces by clicking their legend entries.

  • Animations: We can create animated bubble plots to show changes in data over time or any other dimension. Animations can be controlled by specifying frames.

  • 3D bubble plots: While the classic bubble plot is in two dimensions, Plotly also supports 3D bubble plots where we can add a z-axis to represent a fourth variable, creating a 3D scatter plot with bubbles.

Syntax

The bubble plot syntax typically follows this structure:

import plotly.graph_objects as go

# x_data, y_data, size_data, color_data, and text_data are placeholders
# for our own arrays or lists of equal length
bubble_plot = go.Figure(go.Scatter(
    x=x_data,
    y=y_data,
    mode='markers',  # Set the mode to 'markers' for a scatter plot
    marker=dict(
        size=size_data,  # Assign the size data to marker size
        color=color_data,  # Assign the color data to marker color
        colorscale='Viridis',  # Choose a color scale (optional)
        showscale=True,  # Show a color scale legend (optional)
        colorbar=dict(title='Color Scale'),  # Customize color scale title (optional)
        opacity=0.7,  # Adjust the marker opacity (optional)
    ),
    text=text_data,  # Assign hover text to data points
    # ... other parameters
))

Parameters

The following are the key parameters for creating a bubble plot using Plotly Graph Objects:

  • x: An array or list of x-coordinates for the data points.

  • y: An array or list of y-coordinates for the data points.

  • mode: We set this to 'markers' to create a bubble plot. This indicates that we want to use marker symbols to represent data points.

  • marker: A dictionary that allows us to customize the appearance of the markers (bubbles). Within the marker dictionary, we can specify various subparameters, such as:

    • size: An array or list of size values for the markers, determining the size of each bubble.

    • color: An array or list of color values for the markers, allowing us to color the bubbles based on a variable.

    • colorscale: The color scale to be used for mapping values to colors. Common scales include Viridis, Plasma, Jet, etc. (optional).

    • showscale: Set this to True to display a color bar for the color scale (optional).

    • colorbar: A dictionary that allows us to customize the color scale legend, including its title and other attributes (optional).

    • opacity: The opacity of the markers, allowing us to control how transparent or opaque they appear (optional).

  • text: An array or list of hover text labels for each data point. This text will be displayed when we hover over a bubble.

  • hoverinfo: A string that specifies what information to display on hover. Common options include 'x', 'y', 'text', 'name', and more. We can combine these options by joining them with '+', for example, 'x+y+text'.

  • name: A string that provides a name or label for the trace (a trace is a single data set or a specific visualization on a plot). This is often used in legends when we have multiple traces on the same plot.

  • textposition: Specifies where text labels are drawn relative to the marker when mode includes 'text' (for example, 'markers+text'). It does not affect hover text. Common options include 'top left', 'top center', 'top right', 'middle left', 'middle center', 'middle right', 'bottom left', 'bottom center', and 'bottom right'.

  • textfont: A dictionary that allows us to customize the font properties of the text labels drawn on the plot, including attributes like family, size, color, etc. Hover label fonts are configured separately through hoverlabel.

  • textangle: Specifies the angle at which text labels are displayed in degrees. Note that this attribute belongs to other trace types, such as go.Bar; go.Scatter does not accept it.

  • texttemplate: Allows us to create custom templates for the text labels drawn on the plot, using a text string with placeholders for data values, such as %{x} and %{y}. To format hover text in the same way, we use hovertemplate instead.

  • textsrc: An optional attribute that sets the source reference on Chart Studio Cloud for text. It is rarely needed when working locally.

Return type

When creating a visualization like a bubble plot using the go.Scatter trace and the go.Figure constructor, we generate a figure object that represents our bubble plot. The figure object contains all the necessary information about our plot, including the data, traces, layout, and any additional settings or customizations we’ve applied. In the case of a bubble plot, we use the go.Scatter trace with the mode set to markers to indicate that we want to use marker symbols (bubbles) to represent our data points. The marker attribute within the trace allows us to customize the appearance of these bubbles, including their size and color. The figure object can also include hover text, legends, and other interactive features to make our bubble plot informative and visually appealing.

Implementation

In the following playground, we create a bubble plot using a sample dataset called gapminder provided by Plotly Express. The attributes of the gapminder dataset are defined as follows:

  • country: The name of the country for the data point.

  • continent: The continent to which the country belongs.

  • year: The year in which the data was recorded.

  • lifeExp: The life expectancy of the population in years.

  • pop: The population of the country.

  • gdpPercap: The gross domestic product (GDP) per capita, which is the economic output per person in USD.

  • iso_alpha: The ISO alpha-3 code representing the country.

The bubble plot visualizes GDP per capita by year, incorporating continent color encoding, using Plotly Graph Objects.

Visualization of GDP per capita by year with bubble plot

Explanation

The code above is explained in detail below:

  • Lines 1–3: We import the necessary modules: plotly.graph_objects for creating custom plots, plotly.express for simplified plotting, and pandas for data manipulation.

  • Line 6: We load the gapminder dataset using Plotly Express’s built-in sample dataset.

  • Line 9: We print the first five rows of the loaded dataset using the head() method to inspect the data.

  • Line 12: We create an empty figure object named bubble_plot to which we’ll add traces for the bubble plot.

  • Lines 15–29: We loop through the unique continents in the 'continent' column of the dataset. For each continent, we create a filtered DataFrame (filtered_df) containing only that continent's rows. We then add a scatter trace (a set of instructions on how to display a scatter plot) to the bubble_plot figure, using the 'year' column as the x-values and 'gdpPercap' as the y-values, with data points represented as markers. Each trace uses the continent name as its legend label (name=continent), and the legendgroup=continent parameter groups traces with the same continent value together in the legend. We also customize the marker size, opacity, and outline, and assign text labels from the 'country' column of the filtered data. As a result, each continent gets its own trace, which keeps the continents visually distinct.

  • Lines 32–37: We customize the layout of the bubble plot, including the title, x-axis label, y-axis label, and whether to show the legend.

  • Line 40: We display the finalized bubble plot figure using the show() method.

Conclusion

Bubble plots in Plotly Graph Objects offer a powerful way to visualize multidimensional data with clarity and precision. By representing data points as bubbles, we can convey information about multiple variables, such as GDP per capita and population, while also differentiating the data by categories, like continents. Customization options, including marker size, opacity, and text labels, allow for a tailored presentation of data. The use of Plotly Graph Objects provides the flexibility to create interactive and informative visualizations that enhance our understanding of complex datasets, making bubble plots a valuable tool for data exploration and communication.

Copyright ©2024 Educative, Inc. All rights reserved