Path Handling with pathlib

Explore how to use Python's pathlib module to handle file system paths as objects instead of plain strings. Learn to construct, inspect, check, iterate, and perform file I/O with paths in a cross-platform, readable, and reliable way.

We'll cover the following...

Creating and joining paths
Inspecting Path components
Checking existence and type
Iterating and searching directories
File I/O shortcuts

For many years, Python developers managed file paths by manipulating raw strings, i.e., manually joining directory names, handling forward slashes (/) on Unix systems versus backslashes (\) on Windows, and relying heavily on the os module to smooth over these differences. This approach was fragile and prone to subtle bugs.

Modern Python addresses this problem with the pathlib module. Rather than representing file paths as plain text, pathlib treats them as objects with well-defined behaviors. These path objects automatically handle operating system differences and provide clear, expressive methods for common file system tasks. By using pathlib, we can write cleaner, more readable code that works consistently across platforms, without worrying about platform-specific path syntax.

Creating and joining paths

From the pathlib module, we need to import the Path class. When we create a Path object, we are not simply storing a string; we are creating a structured representation of a location in the file system.

One of the most powerful and elegant features of pathlib is how paths are constructed. Instead of calling helper functions like os.path.join(), we use the division operator (/). Python overloads this operator so that it intuitively combines path components, automatically inserting the correct path separator for the operating system. This approach makes path construction readable and expressive, while remaining fully portable across Windows, macOS, and Linux.
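The listing that the following line notes describe is not reproduced in this text. A minimal sketch, laid out so that its line numbers match those notes, might look like the code below. The names base_dir and logs_dir, the "logs" folder, and the "error.log" file name are assumptions made for illustration; only error_log comes from the notes.

```python
from pathlib import Path

# Start from the directory the script is run from
base_dir = Path.cwd()

# Join components with the / operator
# (the folder and file names are hypothetical)
logs_dir = base_dir / "logs"
error_log = logs_dir / "error.log"

print(error_log)
```

Nothing is created on disk here; the code only builds path objects.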

Line 4: We initialize the base path using Path.cwd(). This sets the base directory to the current working directory, allowing relative paths to resolve from the execution location and avoiding hardcoded absolute paths such as C:\Users\..., which would not work on other systems.
Line 8: The / operator is the key innovation here. Instead of treating paths as text to be glued together, pathlib overloads this mathematical operator to perform “path joining.” It automatically inserts the correct separator for the OS (backslash on Windows, forward slash on Linux and macOS), ensuring the path is valid everywhere.
Line 9: We continue extending the path object by appending the filename. The result error_log remains a Path object, retaining all the useful methods we will use later.

Inspecting `Path` components

Once we have a Path object, we can extract metadata about it instantly. Because the path is an object, it has attributes that describe its structure. We do not need to write complex string parsing logic or regular expressions to find a file extension or a parent folder name.

We can access the full filename, the name without extension, the extension itself, and the containing directory using simple attributes.
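The listing for this section is also missing from this text. A minimal sketch that matches the line notes below could be the following. The variable name report and the reports/2024 directory are hypothetical; only the file name yearly_financials.csv comes from the notes.

```python
from pathlib import Path

# This path does not need to exist on disk (the directories are hypothetical)
report = Path("reports/2024/yearly_financials.csv")

print(report.name)    # yearly_financials.csv
print(report.stem)    # yearly_financials
print(report.suffix)  # .csv
print(report.parent)  # reports/2024 on Linux and macOS
```

Because .parent is itself a Path, a sibling file can be built with report.parent / "summary.txt" (a hypothetical name).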

Line 4: We instantiate a Path object purely in memory. This demonstrates that Path objects are useful even if the file doesn't exist yet; they function as smart parsers that understand path structure immediately.
Line 6: We access .name to get the final component of the path (yearly_financials.csv). This isolates the specific file we are targeting, ignoring the directory tree above it.
Lines 7–8: pathlib splits the filename into its parts for us. .stem gives us the name without its final extension (yearly_financials) for display or logic, while .suffix isolates that final extension (.csv), which is crucial for deciding how to open or process the file.
Line 9: .parent allows us to navigate up one level. This is useful when we have a file path but need to save a sibling file in the same folder.

Checking existence and type

Before attempting to open a file or read a directory, we should verify that it exists and that it is the correct type. A path might point to a file, a directory, or nothing at all. Path objects provide boolean methods to check these states, allowing us to write defensive code that prevents runtime errors.
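The listing for this section is not included in this text. A minimal sketch, arranged so that its line numbers match the notes below, might be the following. The variable name target and the file name pathlib_demo.txt are assumptions; the file is created in the current working directory and removed at the end.

```python
from pathlib import Path

# Hypothetical scratch file in the current working directory
target = Path("pathlib_demo.txt")
target.touch()

print(f"Exists: {target.exists()}")
if target.is_file():
    print("It is a file")
elif target.is_dir():
    print("It is a directory")

print(f"Parent is a directory: {target.parent.is_dir()}")

# Cleanup: delete the file we created
target.unlink()
print(f"Exists after unlink: {target.exists()}")
```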

Line 5: The .touch() method mimics the Linux command of the same name. It ensures the file exists (creating an empty one if necessary) so we have a valid target for our existence checks below.
Line 7: .exists() is our primary safety check. It queries the actual file system to ensure the path is valid before we try to perform operations that would crash if the file were missing.
Lines 8–10: We distinguish between types using .is_file() and .is_dir(). This is critical because trying to "open" a directory or "iterate" a text file will raise errors; these checks ensure we apply the correct logic to the correct type.
Line 16: .unlink() deletes the file; it is the object-oriented counterpart of os.remove(). We call it directly on the Path object we want to remove. For an empty directory, the corresponding method is .rmdir().

Iterating and searching directories

In many applications, we need to work with groups of files, for example, processing every .txt file in a directory. The pathlib module makes this kind of batch processing straightforward.

To iterate over all entries in a directory, we use the .iterdir() method. It returns an iterator of Path objects representing every file and subdirectory in that location.

When we want to select only certain files, we use .glob(). This method supports wildcard patterns similar to those used in the command line, allowing us to filter paths by name or extension, for example, matching all .txt files or all files that start with a specific prefix.

By combining iterdir() and glob(), we can efficiently traverse directories and target exactly the files we need without manual string manipulation.

Python

from pathlib import Path
import shutil

# Setup: Create a directory with some dummy files
data_dir = Path("dataset")
data_dir.mkdir(exist_ok=True)
(data_dir / "image1.png").touch()
(data_dir / "image2.jpg").touch()
(data_dir / "notes.txt").touch()

print("--- All Files ---")
# iterdir() loops through everything in the folder
for item in data_dir.iterdir():
    print(item.name)

print("\n--- Only PNG Images ---")
# glob() filters files matching the pattern
for image in data_dir.glob("*.png"):
    print(f"Found image: {image.name}")

# Cleanup: remove the directory and everything inside it
shutil.rmtree(data_dir)

Line 6: We create the directory using .mkdir(exist_ok=True). The exist_ok=True parameter is a best practice; it prevents the script from crashing if the folder was already created by a previous run.
Line 13: .iterdir() creates a generator that yields a new Path object for every item inside the folder. This allows us to loop through contents one by one without loading a massive list into memory.
Line 18: .glob("*.png") applies a filter directly to the directory scan. Instead of iterating everything and writing an if statement to check extensions, we let pathlib compare each entry's name against the wildcard pattern and yield only the matches. Note that this pattern matches image1.png but not image2.jpg; each call to .glob() takes a single pattern.
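The prefix matching mentioned earlier, and the case of accepting several extensions at once, can be sketched as follows. This is an illustration under assumptions rather than part of the original lesson: the IMAGE_SUFFIXES set and the use of a temporary directory are choices made here so that the example cleans up after itself.

```python
from pathlib import Path
import tempfile

IMAGE_SUFFIXES = {".png", ".jpg"}  # hypothetical set of extensions to accept

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    for name in ("image1.png", "image2.jpg", "notes.txt"):
        (data_dir / name).touch()

    # Prefix pattern: every entry whose name starts with "image"
    by_prefix = sorted(p.name for p in data_dir.glob("image*"))

    # Several extensions: iterate once and test each entry's suffix
    images = sorted(
        p.name
        for p in data_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )

print(by_prefix)  # ['image1.png', 'image2.jpg']
print(images)     # ['image1.png', 'image2.jpg']
```

The results are sorted because neither .iterdir() nor .glob() guarantees any particular order.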

File I/O shortcuts

For simple file operations, we do not always need the full with open(...) pattern. Path objects provide convenient helper methods such as .read_text() and .write_text() that handle opening the file, performing the read or write, and closing the file automatically. These methods are ideal when working with small files, such as configuration files, simple scripts, or short logs, where it is safe to load or replace the entire file contents at once.

For large files or streaming scenarios, the traditional with open(...) approach remains the better choice.
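The listing for this section is missing from this text. A minimal sketch that matches the line notes below might look like this. The variable name config_file, the file name settings.ini, and its contents are hypothetical, and the encoding is passed explicitly so the example behaves the same on every platform.

```python
from pathlib import Path

# Hypothetical file name and contents
config_file = Path("settings.ini")

config_file.write_text("theme=dark\n", encoding="utf-8")

# Read the whole file back as a single string
content = config_file.read_text(encoding="utf-8")
print(content)

# Cleanup: remove the file we created
config_file.unlink()
```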

Line 6: .write_text() abstracts away the entire “open-write-close” cycle. It opens the file in write mode, writes the string, and closes the file before returning, so no file handle is leaked. If the file already exists, its existing contents are overwritten.
Line 9: .read_text() similarly simplifies reading. It opens the file, decodes the bytes, returns the full contents as a string, and closes the file immediately. When no encoding argument is given, the default can depend on the platform and Python version and is not always UTF-8, so passing encoding="utf-8" explicitly is the safer habit. This makes loading a small text file a one-liner.
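For the large-file case mentioned above, a Path object can still be the starting point, because Path.open() wraps the built-in open(). The following is a rough sketch under assumptions: the file name app.log, its contents, and the error-counting logic are invented for illustration.

```python
from pathlib import Path
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / "app.log"  # hypothetical file name
    log_file.write_text(
        "INFO start\nERROR disk full\nINFO stop\n", encoding="utf-8"
    )

    error_count = 0
    # Read one line at a time instead of loading the whole file
    with log_file.open(encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("ERROR"):
                error_count += 1

print(error_count)  # 1
```

Only one line is held in memory at a time, which is what makes this pattern suitable for files too large for .read_text().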

We have moved from thinking of paths as messy strings to handling them as smart objects. By using pathlib, we ensure our file operations are robust, readable, and ready for any operating system.
