How to make a scatterplot in pandas

pandas is a popular Python-based data analysis toolkit that can be imported using:

import pandas as pd

It offers a wide range of utilities, from parsing multiple file formats to converting an entire data table into a NumPy array. This versatility makes pandas a trusted ally in data science and machine learning.
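As a rough illustration of those two utilities, the sketch below parses a small CSV and converts the resulting table to a NumPy array. The CSV text and its column names are made up for the example, and it is read from memory rather than from a file so the snippet is self-contained.

```python
import io

import pandas as pd

# Hypothetical CSV content, held in memory so no file on disk is needed.
csv_text = "length,width\n5,9\n8,3\n9,5\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# to_numpy() returns the table's values as a 2-D NumPy array.
values = df.to_numpy()
print(values.shape)
```

With a real dataset, the io.StringIO wrapper would simply be replaced by the path to the CSV file.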

pandas can also draw several types of data analysis graphs through its Matplotlib-based plotting interface. One example is the scatter plot.

A scatter plot is used to compare large numbers of data points without regard to time. It is a powerful chart type that can be deployed to show the relationship between two variables (e.g., the height and weight of a person).

The signature of the scatter plot method is:

DataFrame.plot.scatter(x, y, s=None, c=None, **kwargs)

Parameters

  • x: int or string - The column name or position to be used as the horizontal coordinate for each point.

  • y: int or string - The column name or position to be used as the vertical coordinate for each point.

  • s: str, scalar, or array-like - The size of each point. Possible values are:
    - A string with the name of the column to be used for the marker's size.
    - A single scalar, so all points have the same size.
    - A sequence of scalars that will be used recursively (cycled through the points in order).

  • c: str, int, or array-like - The color of each point. Possible values are:
    - A single color string, referred to by name, RGB, or RGBA code, used for all points.
    - A sequence of color strings, referred to by name, RGB, or RGBA code, used for the points recursively.
    - A column name or position whose values color the points according to a colormap.

  • **kwargs: Keyword arguments to pass on to DataFrame.plot().
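To illustrate the s and c options listed above, here is a minimal sketch that draws the same data twice: once with a fixed size and a single color, and once with size and color taken from a column. The weight column and its values are hypothetical, the Agg backend is chosen only so the snippet runs without a display, and passing a column name to s assumes a reasonably recent pandas version.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed

import pandas as pd

df = pd.DataFrame({
    "length": [5, 8, 9, 8, 2],
    "width": [9, 3, 5, 2, 1],
    "weight": [10, 40, 90, 20, 60],  # hypothetical column used for size and color
})

# A single scalar for s and a single color string for c apply to every point.
ax_fixed = df.plot.scatter(x="length", y="width", s=50, c="red")

# Column names for s and c: marker size and color now vary per point,
# with colors mapped through the chosen colormap.
ax_scaled = df.plot.scatter(
    x="length", y="width", s="weight", c="weight", colormap="viridis"
)
```

The colormap keyword is one of the arguments forwarded through **kwargs to DataFrame.plot().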

# Import the library
import pandas as pd
# Build a DataFrame from in-memory data
df = pd.DataFrame({'length': [5, 8, 9, 8, 2, 3, 9, 9, 2, 3], 'width': [9, 3, 5, 2, 1, 3, 4, 5, 7, 5]})
# Create a scatter plot of width against length with red markers
scatter_plot = df.plot.scatter(x='length', y='width', c='red')
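The call above returns a Matplotlib Axes object, so the chart can be adjusted and saved afterwards. The following sketch shows one way this might look in a standalone script; the title text and the output file name scatter.png are placeholders, and the Agg backend is an assumption made so the snippet runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: non-interactive backend so no display is needed

import pandas as pd

df = pd.DataFrame({
    "length": [5, 8, 9, 8, 2, 3, 9, 9, 2, 3],
    "width": [9, 3, 5, 2, 1, 3, 4, 5, 7, 5],
})

# plot.scatter returns the Matplotlib Axes the points were drawn on.
ax = df.plot.scatter(x="length", y="width", c="red")

# The Axes can be customized like any other Matplotlib plot.
ax.set_title("Length vs. width")  # placeholder title

# Write the figure to an image file (placeholder file name).
ax.figure.savefig("scatter.png")
```

In an interactive session or notebook, the figure is typically displayed directly, so the savefig step is optional.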

Copyright ©2025 Educative, Inc. All rights reserved