
Get the Dataset Ready

Learn how to open and read the MNIST dataset file in Python by reading all of its lines into a list, and why closing the file afterward frees resources and avoids conflicts. This lesson gets the dataset ready so that you can access the image pixel values and labels needed to train a neural network.

Open the dataset

Before we can do anything with the data, like plotting it or training a neural network with it, we need to find a way to get at it from our Python code.

Opening a file and getting its content is easy in Python. Let’s look at the following code:

Python
data_file = open("mnist_train_100.csv", 'r')
data_list = data_file.readlines()
data_file.close()

There are only three lines of code here. Let’s go through each one.

Code explanation

Open the file

The first line uses the open() function to open a file. The first parameter passed to the function is the name of the file. More precisely, it is a path. Here it is just the filename mnist_train_100.csv, so Python looks for the file in the current working directory, but the path can also include the directory that contains the file.
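To make the point about paths concrete, here is a minimal sketch. It assumes the dataset file is not available, so it first writes a tiny stand-in file with made-up rows (not real MNIST data) into a temporary directory. The names data_dir, full_path, and sample_file are illustrative and not part of the lesson's code.

```python
import os
import tempfile

# Made-up stand-in rows (not real MNIST data), written to a temporary
# directory so the example runs without the lesson's dataset file.
data_dir = tempfile.mkdtemp()
full_path = os.path.join(data_dir, "mnist_train_100.csv")
sample_file = open(full_path, 'w')
sample_file.write("5,0,0,255\n0,12,0,34\n")
sample_file.close()

# A bare filename is resolved against the current working directory.
print(os.path.abspath("mnist_train_100.csv"))

# A full path, directory included, works from any working directory.
data_file = open(full_path, 'r')
data_list = data_file.readlines()
data_file.close()

print(len(data_list))  # → 2, one list entry per line in the file
```

If open() is given only a filename and the file is not in the current working directory, Python raises FileNotFoundError, which is why spelling out the directory is often the safer choice.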

The second ...