Rescale the Target Output

Learn the importance of rescaling the input and output values to achieve good results.

The need to rescale the output

We now need to think about the neural network’s outputs. We saw earlier that the outputs should match the range of values that the activation function can push out. The logistic function we’re using can’t push out numbers like -2.0 or 255. Its range is between 0.0 and 1.0, and in fact, we can’t reach 0.0 or 1.0 because the logistic function can only approach these extremes and will never actually get there. So, it looks like we’ll have to scale our target values when training.
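To make the range restriction concrete, here is a minimal sketch in plain Python. The function name `logistic` and the sample inputs are illustrative choices, not taken from the course's own code.

```python
import math


def logistic(x):
    """Logistic (sigmoid) function: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


# Sample inputs chosen for illustration only.
for x in (-10.0, -2.0, 0.0, 2.0, 10.0):
    y = logistic(x)
    # Each output lies strictly between 0 and 1 for these inputs.
    print(f"logistic({x:+.1f}) = {y:.6f}")
```

Even at inputs of -10.0 and 10.0 the outputs only come close to 0 and 1. A target of exactly 0.0 or 1.0 would therefore ask the network for a value it can approach but never produce.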

What should the output even be? Should it be an image of the answer? That would mean we’d have 28 × 28 = 784 output nodes.

If we take a step back and think about what we’re asking the neural network to do, we realize we’re asking it to classify the image and assign the correct label. That label is one of 10 numbers, 0 to 9. That means the network should be able to have an output layer of 10 nodes, one for each of the possible answers, or labels. If the answer is 0, the first output layer node would fire and the rest should be silent. If the answer is 9, the last output layer node would fire and the rest would be silent.
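The labeling scheme described above can be sketched as a small helper that builds the list of target values for a given label. The name `make_targets` and the values 0.01 and 0.99 are assumptions for illustration. They are one common choice that stays inside the range the logistic function can reach, not values stated in this section.

```python
def make_targets(label, num_outputs=10, low=0.01, high=0.99):
    """Build a target list with `high` at the label's position and `low` elsewhere.

    `low` and `high` are assumed values, chosen to avoid the unreachable
    extremes 0.0 and 1.0 of the logistic function.
    """
    if not 0 <= label < num_outputs:
        raise ValueError(f"label must be in 0..{num_outputs - 1}, got {label}")
    targets = [low] * num_outputs
    targets[label] = high  # the node for the correct answer should "fire"
    return targets


print(make_targets(0))  # first node high, the rest low
print(make_targets(9))  # last node high, the rest low
```

Using values slightly inside the 0.0 to 1.0 range is a design choice: it gives the network targets it can actually reach instead of pushing it toward the extremes.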


The following table illustrates this scheme with some example outputs:
