Understanding Multi-Layer Perceptrons

Unveil the structure and operational flow of MLPs, from input processing to backpropagation in deep learning.


Multi-layer perceptrons (MLPs) are possibly the most frequently illustrated neural networks, yet most of those illustrations leave out a few fundamental explanations. Since MLPs are the foundation of deep learning, this section aims to provide a clearer perspective.

MLP architecture

A high-level representation of a multi-layer perceptron

A typical visual representation of an MLP is shown in the illustration above where:

  • $X_1$ to $X_3$ on the left represent the inputs.
  • The middle nodes represent the hidden layers.
  • The layer on the right is the output.

This high-level representation shows the feed-forward nature of the network. In a feed-forward network, information flows between layers in the forward direction only; the information (features) learned at a layer is not shared with any prior layer.
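To make the feed-forward flow concrete, here is a minimal sketch of such a network with TensorFlow's Keras API. The layer sizes (three inputs to match $X_1$ to $X_3$, two hidden layers of 8 and 4 nodes, and a single output node) are illustrative assumptions rather than values taken from the illustration.

```python
# A minimal feed-forward MLP sketch using TensorFlow's Keras API.
# All layer sizes and activations here are illustrative assumptions.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),                      # input layer: X1, X2, X3
    tf.keras.layers.Dense(8, activation="relu"),     # first hidden layer
    tf.keras.layers.Dense(4, activation="relu"),     # second hidden layer
    tf.keras.layers.Dense(1, activation="sigmoid"),  # output layer
])

model.summary()
```

Calling this model on a batch pushes information strictly forward through the layer stack; no layer feeds its output back to an earlier one.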

The abstracted network shown in the illustration above is unwrapped to its elements in the illustration below.

An unwrapped visual of a multi-layer perceptron

MLP workflow

The journey through an MLP involves a series of meticulously orchestrated steps, each contributing to the network’s ability to learn and make predictions. Each element, its interactions, and its implementation in the context of TensorFlow are explained step by step as follows:

  1. Data introduction: The process starts with a dataset, shown at the top left as $X_{n \times p}$, with $n$ samples and $p$ features.
  2. Batch selection: The model ingests a randomly selected batch during training. The batch contains random samples (rows) from $X$ unless otherwise mentioned. The batch size is denoted as $n_b$ here.
  3. Independent processing: By default, the samples in a batch are processed independently, so their ordering does not matter.
  4. Entering the input layer: The input batch enters the network through an input layer. Each node in the input layer corresponds to a sample feature. Explicitly defining the input layer is optional, but it is done here for clarity.
  5. Navigating hidden layers: The input layer is followed by a stack of hidden layers, up to the last (output) layer. These layers perform the “complex” interconnected nonlinear operations. Although perceived as complex, the underlying operations are rather simple arithmetic computations.

  6. Node functionality: A hidden layer is a stack of computing nodes. Each node extracts a feature from the input. For example, in the sheet-break problem, a node at a hidden layer might determine whether the rotations between two specific rollers are out of sync. A node can, therefore, be imagined as solving one arbitrary subproblem.

  7. Feature mapping: The stack of outputs coming from a layer’s nodes is called a feature map or representation. The size of the feature map, which is also the number of nodes, is called the layer size.

  8. Layer-to-layer transmission: Intuitively, this feature map holds the results of the various subproblems solved at each node. These results carry predictive information forward, layer by layer, until the output layer uses them to predict the response.

  9. Perceptron basics: Mathematically, a node is a perceptron made of weight and bias parameters. The weights at a node are denoted by a vector $w$ and the bias by $b$. A minimal sketch of a node’s computation follows this list.

  10. Processing layer inputs: All the input sample features go to a node. The input to the first hidden layer is the input data features $x = \{x_1, \ldots, x_p\}$. For any intermediate layer, it’s the output (feature map) of the previous layer, denoted as $z = \{z_1, \ldots, z_m\}$, where $m$ is the size of the prior layer.

  11. Feature extraction logic: Consider a hidden layer $l$ of size $m_l$ in the illustration. A node $j$ in the layer $l$ performs a feature extraction with a dot product between the input feature map $z^{(l-1)}$ ...
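
To ground the arithmetic described in the steps above, the following is a minimal NumPy sketch of what a single hidden-layer node computes: a dot product between the incoming feature map and the node’s weight vector $w$, plus the bias $b$, followed by a nonlinearity. The batch size, prior-layer size, and the choice of a ReLU activation are illustrative assumptions.

```python
# A minimal sketch of one hidden-layer node's computation.
# The sizes and the ReLU activation are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_b, m = 2, 4                        # assumed batch size and prior-layer size
z_prev = rng.normal(size=(n_b, m))   # feature map from layer l-1, shape (n_b, m)
w = rng.normal(size=(m,))            # weight vector of one node in layer l
b = 0.1                              # bias of that node

pre_activation = z_prev @ w + b            # dot product per sample, plus bias
z_node = np.maximum(0.0, pre_activation)   # ReLU nonlinearity

print(z_node.shape)  # (2,) -- one extracted feature per sample in the batch
```

Stacking this computation across all $m_l$ nodes of layer $l$ yields the layer’s feature map of shape $(n_b, m_l)$, which becomes the input to layer $l+1$.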