Identifying Exploratory Questions with EDA

Explore how to identify meaningful exploratory questions during the early stages of data analysis with EDA. Understand dataset features, variables, and how to formulate insightful queries that guide visual storytelling and trend analysis.

What is EDA?

Exploratory data analysis, or EDA, is the process of exploring and analyzing our data through visualizations, statistics, and other methods of data storytelling. One of the first steps in EDA is identifying the types of questions we can explore with the dataset.

Initial data analysis

Let's start by loading a sample dataset, the Gapminder dataset, which the Plotly package helpfully provides as a ready-made pandas DataFrame.

Python 3.10.4
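import plotly.data  # explicit submodule import, so plotly.data is always available

# Import the gapminder dataset (returned as a pandas DataFrame)
gapminder_data = plotly.data.gapminder()

# Print the feature names and the first 10 rows of the DataFrame
print(gapminder_data.columns.tolist())
print(gapminder_data.head(10))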
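Beyond the column names and first rows, it can help to see which features are numeric and which are categorical, since that often hints at the kinds of questions a dataset supports (trends over a numeric variable, comparisons across categories). The following is a minimal sketch rather than part of the original lesson: `describe_columns` is a hypothetical helper name, and the numeric/categorical split is a simplifying assumption based only on each column's dtype.

```python
import pandas as pd


def describe_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column: its dtype, a rough kind, and its distinct-value count.

    Hypothetical helper for illustration; "kind" is a dtype-based guess only.
    """
    rows = []
    for name in df.columns:
        col = df[name]
        # Anything pandas does not treat as numeric is labeled categorical here
        kind = "numeric" if pd.api.types.is_numeric_dtype(col) else "categorical"
        rows.append(
            {
                "column": name,
                "dtype": str(col.dtype),
                "kind": kind,
                "n_unique": col.nunique(),
            }
        )
    return pd.DataFrame(rows).set_index("column")


if __name__ == "__main__":
    import plotly.data

    # Same dataset as in the lesson
    gapminder_data = plotly.data.gapminder()
    print(gapminder_data.shape)  # (number of rows, number of columns)
    print(describe_columns(gapminder_data))
```

A column with few distinct values is a natural candidate for grouping or coloring in a plot, while numeric columns with many distinct values are candidates for axes. Note that a dtype-based label is only a first guess: an integer column can still be an identifier code rather than a measurement.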

Let's take a look at the individual variables, and for those with acronyms, ...