Loading Image Dataset

Explore how to load and preprocess an image dataset using JAX and TensorFlow tools. Understand dataset extraction, labeling, visualization, scaling, augmentation, and batch generation. This lesson prepares you to manage image data effectively for model training and evaluation.

Loading image dataset

Let’s now see how to load image data. We’ll use the Cats and Dogs image dataset and start by extracting it from the zip file.

Python 3.10.4
import zipfile

with zipfile.ZipFile('../train.zip', 'r') as zip_ref:
    zip_ref.extractall('.')

In the code above:

  • Line 1: We import the zipfile library.

  • Lines 3–4: We open the zip file in read mode with the ZipFile class of the zipfile module and bind it to zip_ref. The with statement closes the file automatically when the block ends. We then call the extractall() method to extract the contents of the zip file into the current directory.
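
As a minimal sketch of the same extraction step, the snippet below wraps it in a small helper that first checks that the path points to a valid zip archive and then returns the names of the entries it extracted. The helper name extract_archive and its parameters are hypothetical and chosen only for illustration. The calls it relies on, zipfile.is_zipfile(), ZipFile.namelist(), and ZipFile.extractall(), come from the standard library.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the names of its entries.

    Hypothetical helper for illustration; it is not part of the lesson's code.
    """
    zip_path = Path(zip_path)
    # Fail early with a clear message if the path is missing or not a zip file.
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f"{zip_path} is not a valid zip archive")

    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        names = zip_ref.namelist()     # entries stored in the archive
        zip_ref.extractall(dest_dir)   # same call as in the lesson's code
    return names
```

With the lesson's archive, this could be called as names = extract_archive('../train.zip', '.'), after which len(names) gives the number of entries in the archive.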

Next, we create a pandas DataFrame containing the labels and paths to the images.

Python 3.8
import os
import pandas as pd
base_dir = 'train'
filenames = os.listdir(base_dir)
categories = []
for filename in filenames:
    category = filename.split('.')[0]  # text before the first dot in the file name
    if category == 'dog':
        categories.append("dog")
    else:
        categories.append("cat")  # every other prefix is labeled as cat
df = pd.DataFrame({'filename': filenames, 'category': categories})
print(df)

In the code above:

  • Lines 1–2: We import the os module and the pandas library as pd.

  • Lines 3–4: We define the base directory base_dir that contains the images for training the model. We call the listdir() function of the os module to get all file names present in the ...