Data Handling

Learn how large data is loaded and handled in Python.

In this lesson, we will use some famous datasets to practice handling data. We start with the iris dataset, which has long been a benchmark for traditional statistics and machine learning methods. We then briefly explore the MNIST dataset of handwritten digits and some basic image-handling routines.

Basic plots of iris data

Since machine learning requires data, we are commonly faced with importing data from files. There are a variety of tools for handling specific file formats, and the most basic task is reading data from text files. Once the data is loaded, we can manipulate it and plot it in a form that helps us gain insight into the information it contains.
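As a minimal sketch of reading data from a text file, the example below parses a few comma-separated rows with Python's standard csv module and computes the mean petal length per species. The six rows are taken from the iris data (two per species). The column names, the helper names read_iris and mean_petal_length, and the use of io.StringIO in place of a real file are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Six rows of the iris data (two per species), standing in for a text file.
# The header names are an assumption; real files may label columns differently.
IRIS_TEXT = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3.0,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.4,3.2,4.5,1.5,versicolor
6.3,3.3,6.0,2.5,virginica
5.8,2.7,5.1,1.9,virginica
"""


def read_iris(handle):
    """Parse CSV text into a list of (measurements, species) pairs."""
    rows = []
    for record in csv.DictReader(handle):
        species = record.pop("species")
        # The remaining four columns are numeric measurements in cm.
        rows.append(([float(value) for value in record.values()], species))
    return rows


def mean_petal_length(rows):
    """Return the mean petal length (third measurement) for each species."""
    by_species = defaultdict(list)
    for measurements, species in rows:
        by_species[species].append(measurements[2])
    return {name: sum(vals) / len(vals) for name, vals in by_species.items()}


# With a real file, replace io.StringIO(...) with open("some_file.csv", newline="").
rows = read_iris(io.StringIO(IRIS_TEXT))
means = mean_petal_length(rows)
for name, value in means.items():
    print(f"{name}: {value:.2f}")
```

Even on this tiny sample, the petal length separates the three species, which is the kind of insight a plot of the full data makes visible.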

We will discuss some classical machine learning examples. The data for these examples is now often bundled with machine learning libraries, which saves us the time of obtaining and formatting it ourselves. In practice, however, preparing data is a large part of applying machine learning. The examples in this lesson are provided in the notebook HouseMNIST.ipynb.
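To illustrate data that ships with a library, the sketch below loads the iris data through scikit-learn's load_iris and defines a small scatter-plot helper. It assumes scikit-learn and matplotlib are installed, and it is not necessarily how the lesson's notebook does it. The helper name plot_petals and the output file name are hypothetical.

```python
from sklearn.datasets import load_iris

# Load the copy of the iris data bundled with scikit-learn.
iris = load_iris()
X = iris.data    # 150 samples, 4 measurements each (in cm)
y = iris.target  # integer class labels 0, 1, 2

print(X.shape)
print(iris.feature_names)
print(list(iris.target_names))


def plot_petals(X, y, target_names, path="iris_petals.png"):
    """Scatter petal length against petal width, one color per species.

    Hypothetical helper for illustration. matplotlib is imported here so
    that loading the data above does not depend on it.
    """
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, name in enumerate(target_names):
        mask = y == label
        ax.scatter(X[mask, 2], X[mask, 3], label=name)
    ax.set_xlabel("petal length (cm)")
    ax.set_ylabel("petal width (cm)")
    ax.legend()
    fig.savefig(path)
    plt.close(fig)
```

Calling plot_petals(X, y, iris.target_names) writes the figure to disk. In a notebook, the Agg backend line can be dropped so the plot is shown inline.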
