Overview

Samples as tests refers to using real data as fixtures to evaluate the performance of a machine learning model. This is often more effective than testing on synthetic data, which may not reflect the complexity and variability of real-world data.

Several types of real data can be used as samples for testing, including:

corner cases (edge cases that are unusual or extreme).
hard samples (samples that are challenging for the model).
representative samples (samples that accurately reflect the overall characteristics of the data).
holdout sets (data that is withheld during training and used to evaluate the model’s performance).
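
The sample types above can be sketched as plain test functions. This is a minimal illustration under stated assumptions, not a definitive implementation: the predict function is a hypothetical keyword rule standing in for a trained model, and the Sample records are placeholders for entries that would, in practice, be copied from real data. All names here (Sample, predict, failures, accuracy) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    text: str
    label: str


def predict(text: str) -> str:
    """Hypothetical stand-in for a trained model: a keyword sentiment rule."""
    negative_words = {"bad", "terrible", "awful"}
    words = set(text.lower().split())
    return "negative" if words & negative_words else "positive"


# Placeholder samples (assumption): in practice these come from real data.
CORNER_CASES = [Sample("", "positive"), Sample("BAD", "negative")]
HARD_SAMPLES = [Sample("not bad at all", "positive")]
HOLDOUT_SET = [
    Sample("great product", "positive"),
    Sample("terrible service", "negative"),
    Sample("awful experience", "negative"),
    Sample("works fine", "positive"),
]


def failures(samples):
    """Return the samples the model gets wrong."""
    return [s for s in samples if predict(s.text) != s.label]


def accuracy(samples):
    """Fraction of samples the model gets right."""
    return 1 - len(failures(samples)) / len(samples)


def test_corner_cases():
    # Every corner case must pass: a single failure is a regression.
    assert failures(CORNER_CASES) == []


def test_holdout_accuracy():
    # Holdout sets are judged in aggregate against a threshold (0.75 is arbitrary).
    assert accuracy(HOLDOUT_SET) >= 0.75
```

Corner cases are asserted one by one, while the holdout set is checked against an aggregate threshold. The hard sample in this sketch is deliberately one the keyword rule gets wrong, so failures(HARD_SAMPLES) is non-empty; a list like that could be tracked over time rather than required to pass.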

Using samples as tests also has disadvantages. For example, if a model performs well on a particular sample, it does not ...
