An Incremental Strategy for Partitioning
Explore an incremental approach to data partitioning using Python's abstract base classes and operator overloading. Understand how to implement concrete subclasses that allocate data objects into training and testing subsets based on counting logic without extending list features.
We'll cover the following...
Overview
We have an alternative to splitting a single list after it’s been built. Instead of extending the list class to provide two sub-lists, we can reframe the problem slightly. Let’s define a subclass of SamplePartition that makes a random choice between testing and training on each SampleDict object that is presented via initialization, or the append() or extend() methods.
Abstract methods
Here’s an abstraction that summarizes our thinking on this. We’ll have three methods for building a list, and two properties that will provide the training and testing sets, as below.
We don’t inherit from list because we’re not providing any other list-like features, not even __len__(). The class has only five methods, as shown:
This definition has no concrete implementations. It provides five placeholders where methods can be defined to implement the necessary dealing algorithm. We’ve changed the definition of the ...