
Dataset Iteration

Learn to iterate through datasets using TensorFlow by creating Iterators to extract and transform data observations. Understand how to handle batching, shuffling, and repeating data streams efficiently, enabling effective data pipeline management for scalable machine learning applications.

Chapter Goals:

  • Learn how to iterate through a dataset and extract values from data observations
  • Implement a function that iterates through a NumPy-based dataset and extracts the feature data

A. Iterator

The previous few chapters focused on creating and configuring datasets. In this chapter, we’ll discuss how to iterate through a dataset and extract the data.

To iterate through a dataset, we need to create an Iterator object. There are a few different ways to create an Iterator, but we’ll focus on the simplest and most commonly used one: the make_one_shot_iterator function, which the code below calls as tf.compat.v1.data.make_one_shot_iterator.

Python 3.5
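import numpy as np
import tensorflow as tf

# Two data observations with two features each
data = np.array([[1., 2.],
                 [3., 4.]])
dataset = tf.compat.v1.data.Dataset.from_tensor_slices(data)
dataset = dataset.batch(1)  # one observation per batch

# Create the Iterator and get the next-element tensor
it = tf.compat.v1.data.make_one_shot_iterator(dataset)
next_elem = it.get_next()
print(next_elem)

# Operations can be applied directly to the next-element tensor
added = next_elem + 1
print(added)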

In the example, the variable it holds an Iterator for dataset. Its get_next function returns something we’ll refer to as the next-element tensor.
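To see how the next-element tensor yields a new batch on each evaluation, here is a minimal sketch that runs a one-shot Iterator until the dataset is exhausted. It assumes graph-mode execution through tf.compat.v1.Session and an explicit tf.Graph; the helper name collect_batches and its batch_size parameter are hypothetical and used only for illustration.

```python
import numpy as np
import tensorflow as tf


def collect_batches(data, batch_size=1):
    """Hypothetical helper: evaluate (next element + 1) until the dataset ends."""
    results = []
    # Build everything in an explicit graph so the sketch runs in graph mode
    # (an assumption; eager execution behaves differently).
    with tf.Graph().as_default():
        dataset = tf.compat.v1.data.Dataset.from_tensor_slices(data)
        dataset = dataset.batch(batch_size)
        it = tf.compat.v1.data.make_one_shot_iterator(dataset)
        next_elem = it.get_next()
        added = next_elem + 1  # an operation applied to the next-element tensor

        with tf.compat.v1.Session() as sess:
            while True:
                try:
                    # Each run advances the Iterator by one batch
                    results.append(sess.run(added))
                except tf.errors.OutOfRangeError:
                    # Raised once the one-shot Iterator has no batches left
                    break
    return results


data = np.array([[1., 2.],
                 [3., 4.]])
batches = collect_batches(data)
for batch in batches:
    print(batch)
```

With a batch size of 1 and two observations, the loop should collect two batches before tf.errors.OutOfRangeError signals the end of the dataset. A one-shot Iterator cannot be reset, so iterating again requires creating a new Iterator.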

The next-element tensor represents the batched data observation(s) at each iteration through the dataset. We can even apply operations or transformations to the next-element tensor. In the example ...