How to make a scatterplot in pandas

pandas is a popular Python-based data analysis toolkit that can be imported using:

import pandas as pd

It offers a wide range of utilities, from parsing many file formats to converting an entire data table into a NumPy array. This breadth makes pandas a staple of data science and machine learning work.
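As a small illustration of those two utilities, the sketch below parses CSV text into a DataFrame and then converts it to a NumPy array. The column names and values are made up for this example, and the CSV is read from an in-memory string rather than a file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data, kept in memory so no file is needed.
csv_text = "height_cm,weight_kg\n150,52\n160,58\n170,69\n180,80\n"

# Parse the CSV text into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Convert the whole table into a NumPy array (rows x columns).
arr = df.to_numpy()
print(arr.shape)
```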

pandas can create many types of data analysis graphs. One example is the scatter plot.

A scatter plot is used to compare large numbers of data points without regard to time. It is a powerful chart type for showing the relationship between two variables (e.g., the height and weight of a person).

The signature of the scatter plot method is:

DataFrame.plot.scatter(x, y, s=None, c=None, **kwargs)
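A minimal sketch of a call with only x and y might look like the following. The height and weight values are invented for illustration, and the Agg backend is selected only so the example runs without a display; in an interactive session that line can be dropped.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd

# Hypothetical measurements for six people.
df = pd.DataFrame(
    {
        "height_cm": [150, 160, 165, 172, 180, 188],
        "weight_kg": [52, 58, 63, 70, 79, 90],
    }
)

# Plot weight against height; the call returns a Matplotlib Axes object.
ax = df.plot.scatter(x="height_cm", y="weight_kg")
ax.set_title("Height vs. weight")
```

Because the return value is a regular Matplotlib Axes, the figure can then be written to disk with ax.figure.savefig() or customized further with the usual Axes methods.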

Parameters

x: int or str - The column name or position to use as the horizontal coordinate of each point.
y: int or str - The column name or position to use as the vertical coordinate of each point.
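s: str, scalar, or array-like - The size of each point. Possible values are:

- A string with the name of the column to use for the marker size.
- A single scalar, so that all points have the same size.
- A sequence of scalars giving the size of each point. The pandas documentation describes shorter sequences as being cycled through the points; recent Matplotlib versions may require one value per point.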

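c: str, int, or array-like - The color of each point. Possible values are:

- A single color string, given by name or as an RGB or RGBA code, used for all points.
- A sequence of color strings, given by name or as RGB or RGBA codes, giving the color of each point. As with s, shorter sequences are described as being cycled, but recent Matplotlib versions may require one value per point.
- A column name or position whose values are mapped to colors according to a colormap.

**kwargs: Keyword arguments to pass on to DataFrame.plot().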
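The s and c parameters described above can be sketched as follows. The data, the "age" column, and the scaling factor for the marker size are assumptions made for this example; colormap is one of the keyword arguments forwarded through **kwargs.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed

import pandas as pd

# Hypothetical data: the "age" column drives marker size and color below.
df = pd.DataFrame(
    {
        "height_cm": [150, 160, 165, 172, 180, 188],
        "weight_kg": [52, 58, 63, 70, 79, 90],
        "age": [23, 35, 41, 29, 52, 47],
    }
)

# Scalar size and a single RGB hex color: every point looks the same.
ax_fixed = df.plot.scatter(x="height_cm", y="weight_kg", s=40, c="#1f77b4")

# Array-like size (one value per point) and a column name for color:
# the "age" values are mapped to colors through the chosen colormap.
ax_mapped = df.plot.scatter(
    x="height_cm",
    y="weight_kg",
    s=(df["age"] * 2).to_numpy(),
    c="age",
    colormap="viridis",
)
```

Each call without an explicit ax argument draws on a new figure, so the two plots do not overwrite each other.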
