Debugging Chains
Explore methods to debug chained operations in pandas effectively. Learn how to use commenting, the pipe function, and Jupyter's pdb debugger to inspect and troubleshoot intermediate DataFrame states safely and efficiently.
In this section, we’ll explore debugging chains of operations on DataFrames or Series. Almost universally, pandas code is a bit messy. We get it. Chaining produces less code. And because pandas is an in-memory library that works by copying data, the objection that chaining makes extra copies is a moot point. Let’s address the debugging complaint.
We’re going to see a “tweak” function that analyzes the fuel economy data.
Here is our tweak function:
Say we come across this tweak_autos function, and we want to understand what it does. First of all, realize that it’s written like a recipe, step by step:
- Pull out columns found in columns.
- Create various columns (assign).
- Convert column types (astype).
- Drop extra columns that are no longer needed after we’ve created new columns from them (drop).
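To make the recipe concrete, here is a minimal sketch of a chain with the same four steps. It is not the lesson's actual tweak_autos function: the tiny autos frame, the name tweak_autos_sketch, and the derived columns combined and created_year are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical miniature stand-in for the fuel economy data.
autos = pd.DataFrame(
    {
        "make": ["Ford", "Toyota", "Honda"],
        "city08": [18, 30, 28],
        "highway08": [25, 38, 36],
        "createdOn": ["2013-01-01", "2014-06-15", "2015-03-20"],
        "unused": [1, 2, 3],
    }
)


def tweak_autos_sketch(df):
    """Illustrative chain only; the real tweak function may differ."""
    cols = ["make", "city08", "highway08", "createdOn"]
    return (
        df
        [cols]  # 1. pull out the columns listed in cols
        .assign(  # 2. create new columns from existing ones
            combined=lambda d: (d.city08 + d.highway08) / 2,
            created_year=lambda d: pd.to_datetime(d.createdOn).dt.year,
        )
        .astype(  # 3. convert column types
            {"city08": "int16", "highway08": "int16", "make": "category"}
        )
        .drop(columns=["createdOn"])  # 4. drop columns we no longer need
    )


result = tweak_autos_sketch(autos)
print(result.dtypes)
```

Each step sits on its own line inside the parentheses, which is what lets us read the chain top to bottom like a recipe.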
Those who don’t support chaining say there’s no way to debug this. In fact, we have a few ways to debug the chain. The first is by using comments. We comment out all of the operations and then uncomment them one at a time, inspecting the result after each step. This comes in really handy to visually see what’s ...
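The commenting technique might look like the following sketch. The small frame and its column names are hypothetical. Because the chain is wrapped in parentheses with one operation per line, we can disable the later steps with a # and look at the intermediate DataFrame.

```python
import pandas as pd

# Hypothetical sample data, used only to demonstrate the technique.
df = pd.DataFrame(
    {
        "make": ["Ford", "Toyota", "Honda"],
        "city08": [18, 30, 28],
        "highway08": [25, 38, 36],
    }
)

partial = (
    df
    [["make", "city08", "highway08"]]
    .assign(combined=lambda d: (d.city08 + d.highway08) / 2)
    # The steps below are commented out so we can inspect the state so far.
    # Uncomment them one at a time and rerun to watch the frame change.
    # .astype({"city08": "int16", "highway08": "int16"})
    # .drop(columns=["highway08"])
)

print(partial)
```

With the last two steps disabled, partial still contains highway08 and the original integer types, so we see exactly what assign produced before anything else touches it.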