Dataset Format

Explore the dataset formats used for image classification with PyTorch Image Models. Learn how to organize images in train and validation folders, load data with ImageDataset, stream data with IterableImageDataset, and augment data with AugMixDataset.

Introduction to the dataset

The PyTorch Image Models (timm) framework comes with the following Dataset classes:

  • ImageDataset
  • IterableImageDataset
  • AugMixDataset

Our training and validation data need to be organized in the following folder structure:

<base_folder>
├── train
│   ├── class1
│   ├── class2
│   ├── class3
│   ├── ...
│   └── classN
└── val
    ├── class1
    ├── class2
    ├── class3
    ├── ...
    └── classN

Each subfolder is named after a class and contains the images that belong to that class.
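To make the folder convention concrete, here is a minimal sketch of how a class-per-folder layout can be turned into (image path, label) pairs using only the Python standard library. It is not the framework's own loading code. The helper names (find_classes, list_samples, build_demo_tree), the extension list, and the alphabetical class ordering are assumptions made for this illustration.

```python
import tempfile
from pathlib import Path

# Assumption: file extensions that count as images in this sketch.
IMG_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def find_classes(split_dir):
    """Map each class subfolder name to an integer index, sorted by name."""
    names = sorted(p.name for p in Path(split_dir).iterdir() if p.is_dir())
    return {name: idx for idx, name in enumerate(names)}


def list_samples(split_dir, class_to_idx):
    """Return (image_path, label) pairs for every image under split_dir."""
    samples = []
    for name, idx in class_to_idx.items():
        for path in sorted((Path(split_dir) / name).iterdir()):
            if path.suffix.lower() in IMG_EXTENSIONS:
                samples.append((path, idx))
    return samples


def build_demo_tree(base):
    """Create a tiny train/val layout with empty files standing in for images."""
    for split in ("train", "val"):
        for cls in ("class1", "class2"):
            folder = Path(base) / split / cls
            folder.mkdir(parents=True)
            (folder / "img0.jpg").touch()
    return Path(base)


# Hypothetical demo: build a throwaway layout and inspect the training split.
with tempfile.TemporaryDirectory() as tmp:
    base = build_demo_tree(tmp)
    class_to_idx = find_classes(base / "train")
    for path, label in list_samples(base / "train", class_to_idx):
        print(path.relative_to(base), label)
```

Because labels in this sketch are derived from sorted folder names, the train and val splits receive the same class indices only when they contain the same set of class folders, which is why both splits in the layout above list identical classes.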

The ImageDataset class

We can use the ImageDataset class to create the training, validation, and test datasets for our image classification model.

It accepts the following arguments:

class ImageDataset(root, parser, class_map, load_bytes, transform) ->
...