Get to Know the Problem

Explore supervised learning by solving a real-life problem and plotting its data on a 2D graph.

The problem statement

Our friend owns a cozy little pizzeria in a busy metropolitan city. Every day at noon, they check the number of reserved seats and decide how much pizza dough to prepare for dinner. Too much dough, and it goes to waste; too little, and they run out of pizzas. Either way, the restaurant loses money.

It’s not always easy to gauge the number of pizzas from the reservations. Many customers don’t reserve a table, or they eat something other than pizza. The owner knows that there is some kind of link between those numbers, in that more reservations generally mean more pizzas, but other than that, it’s not clear what the exact relation is.

The restaurant owner wants a program that looks at historical data, grasps the relation between reserved seats and pizzas, and uses it to forecast tonight’s pizza sales from today’s reservations. Can we code such a program for them?

Supervised pizza

Remember what we learned in the Supervised Learning lesson? We can solve the pizza forecasting problem by training a supervised learning algorithm with a bunch of labeled examples. To get the examples, we ask the restaurant owner to jot down a few days’ worth of reservations and pizzas and collect that data in a file. Here’s the header row and the first four examples from that file:

Reservations	Pizzas
13	33
2	16
14	32
23	51

The file contains 30 lines of data. Each line is an example, composed of an input variable (the reservations) and a numerical label (the pizzas). Once we have an algorithm, we can use these examples to train it. Later, during the prediction phase, we can pass a specific number of reservations to the algorithm and ask it to come up with a matching number of pizzas.
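
To make the terminology concrete, here is a minimal sketch that holds the four examples shown above as (input, label) pairs. The names examples, inputs, and labels are illustrative only and are not part of the lesson's code.

```python
# The four examples listed above, each a (reservations, pizzas) pair.
examples = [(13, 33), (2, 16), (14, 32), (23, 51)]

# Separate every example into its input variable and its numerical label.
inputs = [reservations for reservations, _ in examples]
labels = [pizzas for _, pizzas in examples]

for reservations, pizzas in zip(inputs, labels):
    print("{} reservations -> {} pizzas".format(reservations, pizzas))
```

During training, an algorithm would see both lists; during prediction, it would receive only a new input and would have to produce the label itself.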

Let’s start with the numbers as a data scientist would.

Make sense of the data

If we glance at the pizza examples, it seems that the reservations and pizzas are correlated.

The NumPy library has a convenient function to import whitespace-separated data from text:

import numpy as np
X, Y = np.loadtxt("pizza.txt", skiprows=1, unpack=True)

The first line imports the NumPy library, and the second uses NumPy’s loadtxt() function to load the data from the pizza.txt file. The skiprows=1 argument skips the header row, and unpack=True “unpacks” the two columns into separate arrays called $X$ and $Y$. $X$ contains the values of the input variable, and $Y$ contains the labels. We use uppercase names for $X$ and $Y$ because that’s a common Python convention to indicate that a variable should be treated as a constant.
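
To see what skiprows and unpack do without needing pizza.txt on disk, the sketch below feeds loadtxt() an in-memory copy of the header and the four examples shown earlier. The sample variable is a stand-in for the real file, introduced here only for illustration.

```python
import io
import numpy as np

# Stand-in for pizza.txt: the header row plus the four examples shown earlier.
sample = io.StringIO(
    "Reservations\tPizzas\n"
    "13\t33\n"
    "2\t16\n"
    "14\t32\n"
    "23\t51\n"
)

# skiprows=1 drops the header; unpack=True returns one array per column.
X, Y = np.loadtxt(sample, skiprows=1, unpack=True)

print(X)  # the reservations column, loaded as floats
print(Y)  # the pizzas column, loaded as floats
```

Without unpack=True, loadtxt() would return a single two-dimensional array with one row per example instead of one array per column.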

Let’s peek at the data to make sure it loaded correctly. To follow along, run the two lines that load pizza.txt in a Python interpreter, and then check the first few elements of $X$ and $Y$:

➾ X[0:5]
[ 13. 2. 14. 23. 13.]
➾ Y[0:5]
[ 33. 16. 32. 51. 27.]

The numbers are consistent with our friend’s file, but it’s still hard to make sense of them. Let’s plot them on a chart for clarity:

# Plot the reservations/pizzas dataset.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sea
sea.set()
plt.axis([0, 50, 0, 50])                                 # scale axes (0 to 50)
plt.xticks(fontsize=14)                                  # set x axis ticks
plt.yticks(fontsize=14)                                  # set y axis ticks
plt.xlabel("Reservations", fontsize=14)                  # set x axis label
plt.ylabel("Pizzas", fontsize=14)                        # set y axis label
X, Y = np.loadtxt("pizza.txt", skiprows=1, unpack=True)  # load data
plt.plot(X, Y, "bo")                                     # plot data
plt.show()                                               # display chart
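
The chart gives a visual impression of the correlation. As a rough numerical check, the sketch below computes the Pearson correlation coefficient with NumPy's corrcoef() function. It uses only the five examples printed earlier rather than the full 30-line file, so the result is indicative at best. The variable names are assumptions made for this illustration.

```python
import numpy as np

# The first five examples printed earlier in this lesson (not the full dataset).
reservations = np.array([13.0, 2.0, 14.0, 23.0, 13.0])
pizzas = np.array([33.0, 16.0, 32.0, 51.0, 27.0])

# corrcoef() returns a 2x2 correlation matrix; the off-diagonal entry is the
# correlation between the two variables.
r = np.corrcoef(reservations, pizzas)[0, 1]
print("Pearson correlation: {:.2f}".format(r))
```

A coefficient close to 1 would suggest that pizzas tend to rise with reservations in a roughly linear way, which matches the impression we get from glancing at the examples.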
