Shape It Right

Learn to reshape and tidy data with pandas.

We'll cover the following...

What is tidy data?
Wide format vs. long format
Pivoting with pivot() and pivot_table()
- The pivot() method
- The pivot_table() function
Folding and unfolding with stack() and unstack()
- Unstacking
- Stacking
Conclusion

Think of your dataset like a LEGO set: the same pieces can be arranged in different ways depending on what you want to build. Want to compare variables side by side? Go wide. Need to analyze trends across categories or time? Go long.

Reshaping data is not just about formatting; it’s about making data usable. In this lesson, we’ll unpack what tidy means, explore the difference between wide and long formats, and master the reshaping tools in pandas (melt(), pivot(), pivot_table(), stack(), and unstack()) so we can transform a DataFrame to fit the task at hand.

What is tidy data?

Tidy data is a standardized way to organize datasets that makes analysis, modeling, and visualization much easier. It follows three fundamental principles:

Each variable forms a column. Every distinct attribute or measurement is stored in its own column.
Each observation forms a row. Each row represents one complete set of measurements or attributes for a single entity or event.
Each type of observational unit forms a table. Different entities or observational types should be stored in separate tables to avoid mixing unrelated data.

This consistent and predictable structure is essential because many data manipulation and visualization tools expect data to be tidy. When data is tidy, we can easily apply filters, groupings, summaries, and charts without complicated reshaping.
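To make the three principles concrete, here is a minimal sketch of a tidy table. The store names, months, and sales figures are made up for illustration. Each column holds one variable, each row holds one observation, and filtering or grouping then takes a single step:

```python
import pandas as pd

# Hypothetical tidy dataset: one row per (store, month) observation,
# one column per variable (store, month, sales).
tidy = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "sales": [100, 120, 90, 95],
})

# Filtering: keep only the observations where sales exceed 95.
high_sales = tidy[tidy["sales"] > 95]

# Grouping and summarizing: total sales per store.
total_by_store = tidy.groupby("store")["sales"].sum()

print(high_sales)
print(total_by_store)
```

Because every variable has its own column, neither operation needs any reshaping first.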

Wide format vs. long format

Understanding how our data is structured helps us decide how to reshape it.

Wide format: The data spreads variables across multiple columns. For example, monthly sales might be stored in separate columns like Jan_Sales, Feb_Sales, Mar_Sales. This makes it easy to compare values side by side, but can become unwieldy for functions or visualizations that require data in a stacked format. Let’s create a sample dataset and see what a wide format looks like:
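A minimal sketch of such a dataset might look like the following. The product names and sales figures are assumptions made up for illustration; only the column names Jan_Sales, Feb_Sales, and Mar_Sales come from the example above.

```python
import pandas as pd

# Hypothetical sample data in wide format:
# one row per product, one separate column for each month's sales.
wide_df = pd.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet"],
    "Jan_Sales": [120, 200, 80],
    "Feb_Sales": [135, 210, 75],
    "Mar_Sales": [150, 190, 90],
})

print(wide_df)
```

Each product occupies a single row, so the three months can be compared by reading across that row. Note that the month itself is not stored as a value anywhere; it lives only in the column names.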
