How to plot Andrews curves in pandas
Overview
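Andrews curves visualize high-dimensional data by mapping each observation $x = (x_1, x_2, \dots, x_d)$ onto a function of a single variable $t$. This function is defined as follows:

$$f_x(t) = \frac{x_1}{\sqrt{2}} + x_2 \sin(t) + x_3 \cos(t) + x_4 \sin(2t) + x_5 \cos(2t) + \cdots$$

- The coefficients $x_1, x_2, \dots$ represent the values of each dimension.
- The variable $t$ is linearly spaced between $-\pi$ and $\pi$.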
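To make the definition concrete, here is a minimal sketch that evaluates one Andrews curve using only the Python standard library. The helper names `andrews_curve` and `linspace` and the sample observation are illustrative assumptions; this is not how pandas implements the plot internally.

```python
import math


def andrews_curve(x, t):
    """Evaluate the Andrews curve of observation x at a single value t."""
    # First term: x_1 / sqrt(2)
    value = x[0] / math.sqrt(2)
    # Remaining terms alternate sine and cosine with increasing frequency:
    # x_2 sin(t), x_3 cos(t), x_4 sin(2t), x_5 cos(2t), ...
    for i, coeff in enumerate(x[1:]):
        k = i // 2 + 1  # frequency of the current sine/cosine pair
        if i % 2 == 0:
            value += coeff * math.sin(k * t)
        else:
            value += coeff * math.cos(k * t)
    return value


def linspace(start, stop, num):
    """Return num evenly spaced values from start to stop (inclusive)."""
    step = (stop - start) / (num - 1)
    return [start + step * i for i in range(num)]


# Hypothetical four-dimensional observation (illustrative values only).
observation = [5.1, 3.5, 1.4, 0.2]

# t is linearly spaced between -pi and pi.
ts = linspace(-math.pi, math.pi, 200)
curve = [andrews_curve(observation, t) for t in ts]
```

Each observation yields one such list of values, so plotting `curve` against `ts` for every row of a dataset produces one curve per row.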
Andrews curves preserve means, distances (up to a constant factor), and variances. As a result, curves that lie close together correspond to data points that are close together.
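The distance property can be checked numerically. The sketch below assumes that the distance between two curves is measured as the integral of their squared difference over $[-\pi, \pi]$; under that assumption the integral equals $\pi$ times the squared Euclidean distance between the two observations. The helper names and the two sample points are hypothetical.

```python
import math


def andrews_curve(x, t):
    """Evaluate x_1/sqrt(2) + x_2 sin(t) + x_3 cos(t) + x_4 sin(2t) + ... at t."""
    value = x[0] / math.sqrt(2)
    for i, coeff in enumerate(x[1:]):
        k = i // 2 + 1
        value += coeff * (math.sin(k * t) if i % 2 == 0 else math.cos(k * t))
    return value


def squared_curve_distance(x, y, n=1000):
    """Approximate the integral of (f_x(t) - f_y(t))^2 over [-pi, pi]."""
    # Rectangle rule on n evenly spaced points covering one full period.
    width = 2 * math.pi / n
    total = 0.0
    for j in range(n):
        t = -math.pi + j * width
        total += (andrews_curve(x, t) - andrews_curve(y, t)) ** 2
    return total * width


# Two hypothetical three-dimensional observations.
x = [1.0, 2.0, 3.0]
y = [2.0, 0.0, 1.0]

squared_euclidean = sum((a - b) ** 2 for a, b in zip(x, y))
ratio = squared_curve_distance(x, y) / squared_euclidean

# The ratio should match pi: the constant factor relating the two distances.
print(ratio, math.pi)
```

Because the ratio is the same constant for every pair of points, observations that are close in the original space produce curves that are close to each other.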
The andrews_curves() function in pandas
The andrews_curves() function in the pandas.plotting module plots Andrews curves for a DataFrame. Each row of the frame is drawn as a single curve.
Syntax
pandas.plotting.andrews_curves(frame, class_column, ax=None, samples=200, color=None, colormap=None, **kwargs)
Parameters
- frame: This is the DataFrame to plot.
- class_column: This is the name of the column containing class names.
- ax: This is the matplotlib axes object.
- samples: This corresponds to the number of points to plot in each curve.
- color: This parameter can be a list or tuple of colors that can be used for different classes.
- colormap: This can be a string or a matplotlib object where colors can be selected from the colormap.
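The optional parameters can be combined as in the following sketch. The small DataFrame, its column names, the output file name, and the choice of the "viridis" colormap are assumptions made for illustration; any other dataset with one class column and numeric feature columns should work the same way.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Small hypothetical dataset: three numeric features plus a class column.
df = pd.DataFrame({
    "a": [1.0, 1.2, 3.0, 3.1],
    "b": [0.5, 0.4, 2.0, 2.2],
    "c": [2.0, 2.1, 0.3, 0.2],
    "label": ["low", "low", "high", "high"],
})

# Create the axes ourselves so we control the figure size and title.
fig, ax = plt.subplots(figsize=(8, 4))

# samples=100 evaluates each curve at 100 points;
# colormap="viridis" picks one color per class from that colormap.
ax = pd.plotting.andrews_curves(
    df, "label", ax=ax, samples=100, colormap="viridis"
)
ax.set_title("Andrews curves")

# Save the figure to a file (the file name is arbitrary).
fig.savefig("andrews_curves.png")
```

Passing an existing axes object through `ax` is useful when the Andrews curves should sit next to other plots in the same figure.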
Example
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv(
    'https://raw.github.com/pandas-dev/'
    'pandas/main/pandas/tests/io/data/csv/iris.csv'
)
print(df.head())
pd.plotting.andrews_curves(df, 'Name')
plt.show()
Explanation
- Lines 1–2: We import the pandas and matplotlib packages.
- Lines 4–7: We read the iris dataset into a DataFrame called df.
- Line 8: We print the first few rows of df.
- Line 9: We plot the Andrews curves using the andrews_curves() function. Here, the Name column of the DataFrame is a categorical column containing the class names.
- Line 10: We display the plotted graph.