Exploratory Data Analysis

Explore how to conduct exploratory data analysis (EDA) on the Iris dataset using PyCaret. Understand key visualization methods such as pie charts, box plots, correlation heatmaps, and scatter plot matrices, and use them to analyze the distributions, class balance, and feature relationships that matter for classification modeling.

We will now perform EDA on the Iris dataset. EDA is a fundamental part of every machine learning project because visualizations help us understand the basic statistical properties of a dataset.

Pie charts

Pie charts let us easily visualize the proportion of each category in a categorical variable.

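import matplotlib.pyplot as plt

# Plot a pie chart of the class counts (assumes `data` is the Iris DataFrame loaded earlier)
data['species'].value_counts().plot(kind='pie')
plt.ylabel('')  # hide the default y-axis label
plt.show()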

As we can see in the output, the Iris classes are evenly distributed. Each one is 33.3% ...
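
The class balance can also be checked numerically rather than read off the chart. The following is a minimal sketch, not part of the original lesson: it builds a stand-in `species` series with 50 rows per class (the size of the classic Iris dataset) so that it runs on its own. The label strings and the variable names `species`, `proportions`, and `percentages` are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for data['species']: the classic Iris dataset has 50 rows per class.
# The exact label strings are an assumption about how the loaded dataset names them.
species = pd.Series(
    ["Iris-setosa"] * 50 + ["Iris-versicolor"] * 50 + ["Iris-virginica"] * 50,
    name="species",
)

# normalize=True returns each class's share of the rows instead of raw counts
proportions = species.value_counts(normalize=True)

# Express the shares as percentages rounded to one decimal place
percentages = (proportions * 100).round(1)
print(percentages)
```

With the real DataFrame, `species` would simply be `data['species']`. If the percentages should appear on the chart itself, passing `autopct='%1.1f%%'` to `plot(kind='pie')` should forward it to matplotlib's pie function, which prints each wedge's share.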