Shape It Right

Learn to reshape and prepare structured data for clear, analysis-ready insights using pandas.

We'll cover the following...

What is tidy data?
Wide format vs. long format
Pivoting with pivot() and pivot_table()
- The pivot() method
- The pivot_table() function
Folding and unfolding with stack() and unstack()
- Unstacking
- Stacking
Wrap up

As data analysts, we rely on structure to make sense of information. Imagine trying to analyze survey responses where each answer is stored in a separate file, or trying to compare monthly sales when each month is its own column. That kind of clutter makes it nearly impossible to run clean comparisons or build effective visuals.

That’s where data reshaping comes in. Reshaping is about turning scattered, inconsistent structures into tidy, streamlined tables. This means each row is an observation, each column a variable, and every piece of data fits into place.

In this lesson, we’ll unpack what tidy data really means, explore wide vs. long formats, and get hands-on with pandas tools like melt(), pivot(), pivot_table(), stack(), and unstack(), so we can reshape any DataFrame to suit our analysis.

What is tidy data?

“Tidy” sounds like a colloquial term, right? In technical terms, however, tidy data follows three simple rules:

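Each variable forms a column. Every distinct attribute or measurement is stored in its own column. For example, in a student dataset, Name, Subject, and Score should each be in separate columns, not combined into one.

Each observation forms a row. Every row describes a single observation, such as one student's score in one subject.

Each type of observational unit forms a table. Different kinds of entities are kept in separate tables. For example, student scores and teacher room assignments belong in two tables rather than one.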

This consistent and predictable structure is essential because many data manipulation and visualization tools expect data to be tidy. When data is tidy, we can easily apply filters, groupings, summaries, and charts without complicated reshaping.
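To make this concrete, here is a minimal sketch of filtering and summarizing a tidy table in pandas. The student scores are hypothetical values chosen for illustration, and the variable names are assumptions rather than part of the lesson's own dataset.

```python
import pandas as pd

# Hypothetical tidy data: one row per (student, subject) observation.
scores = pd.DataFrame(
    {
        "Name": ["Alice", "Alice", "Bob", "Bob"],
        "Subject": ["Math", "Science", "Math", "Science"],
        "Score": [89, 90, 77, 85],
    }
)

# Filtering is a single boolean mask on one column.
math_scores = scores[scores["Subject"] == "Math"]

# Grouping and summarizing works directly, with no reshaping first.
mean_by_subject = scores.groupby("Subject")["Score"].mean()

print(math_scores)
print(mean_by_subject)
```

Because Subject is stored as a value rather than spread across column names, the same two lines keep working if more subjects are added later.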

Wide format vs. long format

Understanding how our data is structured is key to effective analysis and visualization. Two common data shapes, wide format and long format, determine how we organize variables and observations. Knowing the difference between the two helps us decide when and how to reshape the data for different tasks.

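Wide format: In wide format, similar measurements are spread across multiple columns, for example one row per student with a separate score column for each subject. This is convenient for quick human inspection, but harder to filter, group, and plot in code because the subject is encoded in the column names rather than stored as a value.

Long format: In long format, those measurements are stacked into a single value column, with another column identifying what each value measures. Each row then holds one observation, which matches the tidy rules above.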

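The sample tables below are in long, tidy form, with one column per variable and one row per observation:

Name	Subject	Score
Alice	Math	89
Bob	Math	77

Name	Subject	Score
Alice	Math	89
Alice	Science	90

Student scores and teacher details are kept in separate tables:

Name	Subject	Score
Alice	Math	89

Name	Subject	Room
Mr. John	Math	101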
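The difference between the two shapes is easiest to see by converting one into the other. The sketch below builds a small wide table and reshapes it to long format with melt(), one of the pandas tools named in the introduction. The scores and variable names are hypothetical, picked only for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per student, one column per subject.
wide_df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob"],
        "Math": [89, 77],
        "Science": [90, 85],
    }
)

# melt() keeps the id_vars column and stacks the remaining columns into
# (Subject, Score) pairs, giving one row per student and subject.
long_df = wide_df.melt(id_vars="Name", var_name="Subject", value_name="Score")

print(wide_df)
print(long_df)
```

The wide table has two rows and one column per subject, while the long table has four rows and a fixed set of three columns, so adding a subject adds rows instead of changing the schema.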
