
Samples as Tests

Learn how to use samples of real data for testing.

Overview

"Samples as tests" refers to using real data as test fixtures to evaluate the performance of a machine learning model. This is often more effective than testing with synthetic data, which may not reflect the complexity and variability of real-world data.

Several types of real data can be used as samples for testing, including:

  • corner cases (edge cases that are unusual or extreme).

  • hard samples (samples that are challenging for the model to predict correctly).

  • representative samples (samples that accurately reflect the overall characteristics of the data).

  • holdout sets (data that is withheld during training and used to evaluate the model’s performance).
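
As a rough illustration of the sample types listed above, the sketch below tags a few samples by type and reports accuracy per type, so a failure on hard samples is not hidden by good results on representative ones. Everything in it is hypothetical: `predict` is a toy keyword classifier standing in for a trained model, and `SAMPLES`, `accuracy_by_kind`, and `failing_samples` are illustrative names, not part of any particular library. In practice the samples would be loaded from real data rather than written inline.

```python
from collections import defaultdict

# Hypothetical stand-in for a trained model: a keyword-based sentiment classifier.
def predict(text: str) -> str:
    lowered = text.lower()
    negative_words = ("bad", "terrible", "broken")
    return "negative" if any(w in lowered for w in negative_words) else "positive"

# Hypothetical samples, each tagged with its sample type.
# A real suite would load these from stored production or holdout data.
SAMPLES = [
    {"kind": "representative", "text": "Great product, works well", "label": "positive"},
    {"kind": "representative", "text": "Terrible quality, arrived broken", "label": "negative"},
    {"kind": "corner_case", "text": "BAD!!!!!!!!", "label": "negative"},
    {"kind": "hard", "text": "Not bad at all", "label": "positive"},
]

def accuracy_by_kind(samples, predict_fn):
    """Return the fraction of correct predictions for each sample type."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        total[sample["kind"]] += 1
        correct[sample["kind"]] += int(predict_fn(sample["text"]) == sample["label"])
    return {kind: correct[kind] / total[kind] for kind in total}

def failing_samples(samples, predict_fn):
    """Return the samples the model gets wrong, for inspection."""
    return [s for s in samples if predict_fn(s["text"]) != s["label"]]

if __name__ == "__main__":
    for kind, score in accuracy_by_kind(SAMPLES, predict).items():
        print(f"{kind}: {score:.2f}")
    for sample in failing_samples(SAMPLES, predict):
        print("failed:", repr(sample["text"]))
```

Reporting results per sample type, rather than as a single aggregate score, is one possible design choice. It makes it easier to see which kind of sample a regression affects.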

However, there are also some disadvantages to using samples as tests. For example, if a model performs well on a particular sample, it does not ...