What are radial basis function neural networks?
In machine learning and artificial intelligence, neural networks (NNs) are widely used because they can learn from data and make predictions or decisions without being explicitly programmed for the task. Among the various types of neural networks, radial basis function neural networks (RBFNNs) are a distinct class that has proved effective in applications including function approximation, time series prediction, classification, and control.
In this Answer, we will explore the components and functionality of radial basis function neural networks.
What are radial basis function neural networks?
A radial basis function (RBF) neural network is a type of artificial neural network that uses radial basis functions as activation functions. It typically consists of three layers: an input layer, a hidden layer, and an output layer. The hidden layer applies a radial basis function, usually a Gaussian function, to the input. The output layer then linearly combines these outputs to generate the final output. RBF neural networks are highly versatile and are extensively used in pattern classification tasks, function approximation, and a variety of machine learning applications. They are especially known for their ability to handle non-linear problems effectively.
Structure of RBF neural networks
An RBF neural network typically comprises three layers:
Input layer: This layer simply transmits the inputs to the neurons in the hidden layer.
Hidden layer: Each neuron in this layer applies a radial basis function to the inputs it receives.
Output layer: Each neuron in this layer computes a weighted sum of the outputs from the hidden layer, resulting in the final output.
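The three layers can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `rbf_forward`, the Gaussian form `exp(-beta * distance**2)`, and the example centers and weights are assumptions chosen only to show the data flow.

```python
import numpy as np

def rbf_forward(x, centers, beta, weights):
    """Forward pass through a three-layer RBF network for one input vector."""
    # Input layer: pass the input through unchanged.
    x = np.asarray(x, dtype=float)
    # Hidden layer: one Gaussian activation per center, based on squared distance.
    squared_distances = np.sum((centers - x) ** 2, axis=1)
    hidden = np.exp(-beta * squared_distances)
    # Output layer: weighted sum of the hidden activations.
    return hidden @ weights

# Hypothetical parameters chosen only for illustration.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
weights = np.array([1.0, -1.0])
output = rbf_forward([0.0, 0.0], centers, beta=1.0, weights=weights)
print(output)
```

The input sits exactly on the first center, so that neuron fires at its maximum of 1, while the second neuron contributes a much smaller activation.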
[Figure: basic flow diagram of the RBF neural network]
Mathematical background
The output of an RBF network is a linear combination of radial basis functions. It is given by:
$$y(x) = \sum_{i=1}^{N} w_i \, \varphi\left(\lVert x - c_i \rVert\right)$$
where:
- $x$ is the input vector
- $N$ is the number of neurons in the hidden layer
- $w_i$ are the weights of the connections from the hidden layer to the output layer
- $c_i$ are the centers of the radial basis functions
- $\lVert x - c_i \rVert$ is the Euclidean distance (the straight-line distance between two points, computed using Pythagoras' theorem) between the input vector and the center of the radial basis function
- $\varphi$ is the radial basis function, usually chosen to be a Gaussian function with width parameter $\beta$:
$$\varphi(r) = e^{-\beta r^2}$$
Note: To learn more about Gaussian function or Gaussian distribution, refer to this Answer.
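A small worked example may make the formula concrete. All numbers here are assumed for illustration: two hidden neurons ($N = 2$), $\beta = 1$, input $x = (1, 0)$, centers $c_1 = (1, 0)$ and $c_2 = (0, 0)$, weights $w_1 = 2$ and $w_2 = -1$, and a Gaussian basis function $\varphi(r) = e^{-\beta r^2}$.

```latex
\begin{aligned}
y(x) &= \sum_{i=1}^{N} w_i \, \varphi\left(\lVert x - c_i \rVert\right),
  \qquad \varphi(r) = e^{-\beta r^2} \\
\lVert x - c_1 \rVert &= 0 \;\Rightarrow\; \varphi(0) = e^{0} = 1 \\
\lVert x - c_2 \rVert &= 1 \;\Rightarrow\; \varphi(1) = e^{-1} \approx 0.368 \\
y(x) &= w_1 \cdot 1 + w_2 \cdot e^{-1} = 2 - 0.368 \approx 1.632
\end{aligned}
```

The neuron whose center coincides with the input contributes its full weight, while the more distant neuron contributes only about 37% of its weight.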
How do RBF neural networks work?
Radial basis function networks (RBFNs) classify an input by comparing it to representative points (centers) derived from the training data.
Here’s a simplified explanation:
RBFNs start with an input vector. This vector is fed into the input layer of the network.
The network also has a hidden layer, which comprises radial basis function (RBF) neurons.
Each of these RBF neurons has a center, and they measure how close the input is to their center. They do this using a special function called a Gaussian transfer function. The output of this function is higher when the input is close to the neuron’s center and lower when the input is far away.
The outputs from the hidden layer are then combined in the output layer. Each node in the output layer corresponds to a different category or class of data. The network determines the input’s class by calculating a weighted sum of the outputs from the hidden layer.
The final output of the network is a combination of these weighted sums, which is used to classify the input.
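The steps above can be sketched for a two-class problem with one output node per class. This is only an illustration: the function name `classify`, the centers, and the weight matrix are hypothetical values picked by hand rather than learned from data.

```python
import numpy as np

def classify(x, centers, beta, weight_matrix):
    """Return the winning output node and the per-class weighted sums."""
    x = np.asarray(x, dtype=float)
    # Hidden layer: Gaussian similarity of the input to each center.
    hidden = np.exp(-beta * np.sum((centers - x) ** 2, axis=1))
    # Output layer: one weighted sum (score) per class.
    scores = hidden @ weight_matrix
    return int(np.argmax(scores)), scores

# Hypothetical setup: two RBF neurons and two classes.
centers = np.array([[0.0, 0.0], [3.0, 3.0]])
# Row i holds the weights from hidden neuron i to each output node.
weight_matrix = np.array([[1.0, 0.0],
                          [0.0, 1.0]])

label_near_first, _ = classify([0.2, -0.1], centers, beta=1.0, weight_matrix=weight_matrix)
label_near_second, _ = classify([2.9, 3.2], centers, beta=1.0, weight_matrix=weight_matrix)
print(label_near_first, label_near_second)
```

Each input is assigned to the class whose output node receives the largest weighted sum, which here is the class tied to the nearest center.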
[Figure: visual representation of how an RBF network processes an input]
Understanding RBFNN in a fun way
Let’s think of radial basis function networks (RBFNs) as a team of detectives trying to solve a mystery.
The mystery is the input data. It’s like a puzzle that needs to be solved or a question that needs to be answered.
The detectives are the neurons in the hidden layer. Each detective has a special area of expertise or a “center.” They are good at solving mysteries close to their area of expertise.
When a new mystery comes in, each detective compares it to their area of expertise using a tool called a Gaussian transfer function. This tool tells them how similar the mystery is to what they know best. If the mystery is very similar to their area of expertise, the tool gives a high score. If it’s very different, the tool gives a low score.
Once all the detectives have scored the mystery, the output layer combines their scores using learned weights, so a detective's score counts for more toward the solutions their expertise supports. This is the weighted sum. Each node in the output layer represents a different possible solution to the mystery.
The final solution to the mystery is the one that gets the highest combined score from all the detectives. This is how the network classifies the input data.
So, in a nutshell, RBFNs solve mysteries (classify input data) by letting a team of expert detectives (neurons) compare the mystery to their areas of expertise and combine their scores to find the best solution.
Training an RBF network
Training an RBF network involves two steps:
Determining the centers c_i and the parameter β of the radial basis functions: This can be done using a clustering algorithm like K-means on the training data.
Determining the weights w_i: This can be done using a linear regression algorithm on the outputs of the hidden layer.
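The two steps can be sketched on a toy function-approximation task. Several things here are assumptions made for illustration: the sine data, the hand-written `kmeans_centers` helper (a minimal stand-in for a library K-means), and the heuristic of setting β from the mean squared distance between each point and its nearest center.

```python
import numpy as np

def kmeans_centers(X, k, iterations=25, seed=0):
    """Minimal Lloyd's algorithm; a stand-in for a library K-means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iterations):
        sq = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = sq.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

# Toy data (assumed): noiseless samples of sin(x) on [0, 2*pi].
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0 * np.pi, size=(200, 1))
y = np.sin(X[:, 0])

# Step 1: centers from clustering, beta from the spread around the centers.
centers = kmeans_centers(X, k=10)
sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
beta = 1.0 / (2.0 * sq_dist.min(axis=1).mean())

# Step 2: weights from least squares on the hidden-layer outputs.
hidden = np.exp(-beta * sq_dist)
weights, *_ = np.linalg.lstsq(hidden, y, rcond=None)

mse = np.mean((hidden @ weights - y) ** 2)
print(f"Training MSE: {mse:.6f}")
```

Because the centers and β are fixed after the first step, the second step is an ordinary linear least-squares problem, which is why no gradient-based training is needed in this scheme.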
Python implementation
Here's a simple implementation of a Radial Basis Function Network (RBFN) using Python. This code creates a simple RBFN and trains it on some dummy data.
This implementation will use:
KMeans for clustering and defining the radial basis functions.
Linear regression, solved in closed form with the pseudoinverse, for learning the weights.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression
from sklearn.metrics import accuracy_score
import scipy.spatial.distance as distance

class RadialBasisFunctionNeuralNetwork:
    def __init__(self, num_of_rbf_units=10):
        self.num_of_rbf_units = num_of_rbf_units

    def _rbf_unit(self, rbf_center, point_in_dataset):
        return np.exp(-self.beta * distance.cdist([point_in_dataset], [rbf_center], 'euclidean')**2).flatten()[0]

    def _construct_interpolation_matrix(self, input_dataset):
        interpolation_matrix = np.zeros((len(input_dataset), self.num_of_rbf_units))
        for idx, point_in_dataset in enumerate(input_dataset):
            for center_idx, rbf_center in enumerate(self.rbf_centers):
                interpolation_matrix[idx, center_idx] = self._rbf_unit(rbf_center, point_in_dataset)
        return interpolation_matrix

    def train_model(self, input_dataset, target_dataset):
        self.kmeans_clustering = KMeans(n_clusters=self.num_of_rbf_units, random_state=0).fit(input_dataset)
        self.rbf_centers = self.kmeans_clustering.cluster_centers_
        self.beta = 1.0 / (2.0 * (self.kmeans_clustering.inertia_ / input_dataset.shape[0]))
        interpolation_matrix = self._construct_interpolation_matrix(input_dataset)
        self.model_weights = np.linalg.pinv(interpolation_matrix.T.dot(interpolation_matrix)).dot(interpolation_matrix.T).dot(target_dataset)

    def predict(self, input_dataset):
        interpolation_matrix = self._construct_interpolation_matrix(input_dataset)
        predicted_values = interpolation_matrix.dot(self.model_weights)
        return predicted_values


if __name__ == "__main__":
    # Generating a simple classification dataset
    input_dataset, target_dataset = make_classification(n_samples=500, n_features=2, n_informative=2, n_redundant=0, n_classes=2)

    # Initializing and training the RBF neural network
    rbf_neural_network = RadialBasisFunctionNeuralNetwork(num_of_rbf_units=20)
    rbf_neural_network.train_model(input_dataset, target_dataset)

    # Predicting the target values
    predictions = rbf_neural_network.predict(input_dataset)

    # Converting continuous output to binary labels
    binary_predictions = np.where(predictions > 0.5, 1, 0)

    # print("Accuracy: {}".format(accuracy_score(target_dataset, binary_predictions)))
    print(f"Accuracy: {accuracy_score(target_dataset, binary_predictions)}")

    # Plotting the results
    plt.scatter(input_dataset[:, 0], input_dataset[:, 1], c=binary_predictions, cmap='viridis', alpha=0.7)
    plt.scatter(rbf_neural_network.rbf_centers[:, 0], rbf_neural_network.rbf_centers[:, 1], c='red')
    plt.title('Classification Result')
    plt.show()
Note: Running the code prints the accuracy and displays the plot.
Code explanation
Let’s break down the code:
Lines 1–7: Importing the required libraries. We use numpy for numerical operations, matplotlib for plotting, sklearn.cluster.KMeans for unsupervised clustering, sklearn.datasets.make_classification for creating a classification dataset, sklearn.metrics.accuracy_score for evaluating the model, and scipy.spatial.distance for calculating distances. sklearn.linear_model.LinearRegression is also imported, but this implementation does not call it; the weights are computed with the pseudoinverse instead.
Lines 9–11: Defining the RadialBasisFunctionNeuralNetwork class and its constructor. The class represents a radial basis function neural network, and its constructor takes one argument: the number of radial basis functions (hidden units).
Lines 13–14: The _rbf_unit method is defined. This is a helper function to calculate the output of a radial basis function (RBF). It computes the Euclidean distance (the straight-line distance between two points) between a point and an RBF center, squares it, multiplies it by negative beta, and finally applies the exponential function.
Lines 16–21: The _construct_interpolation_matrix method is defined. It creates an interpolation matrix, where each entry corresponds to the output of an RBF given an input data point. This matrix is needed to compute the weights in the RBFNN.
Lines 23–28: The train_model method is defined. It first uses KMeans clustering to find the centers of the RBFs. It then computes beta from the average squared distance between data points and their nearest cluster center (the K-means inertia divided by the number of samples). Finally, it computes the weights that connect the RBFs to the output layer as a least-squares solution, using the pseudoinverse together with the interpolation matrix and the target values. The pseudoinverse, calculated here using np.linalg.pinv(), is a generalization of the matrix inverse that is used for solving systems of linear equations, especially for non-square matrices or matrices that are not of full rank.
Lines 30–33: The predict method is defined. It computes the interpolation matrix for the input data and uses it with the weights to predict the output values.
Lines 36–46: The main program execution starts. A classification dataset is generated using the make_classification function from sklearn. Then an instance of the RBFNN is created and trained on the data. The trained model is then used to predict labels for the input data.
Line 48: The continuous output from the predict function is converted to binary class labels based on a threshold of 0.5.
Line 51: The accuracy of the model on the training data is printed to the console.
Lines 54–57: A scatter plot of the data is created, where the color of each point indicates its predicted class label. The centers of the RBFs are also plotted as red points. The plot is displayed using plt.show().
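The interpolation matrix described above is filled one entry at a time in two nested loops. A possible alternative, sketched here with NumPy broadcasting, computes every entry in one step. The function names are hypothetical and not part of the original class, and the random points and centers are assumed test data.

```python
import numpy as np

def interpolation_matrix_loop(points, centers, beta):
    """Entry-by-entry construction, mirroring the nested-loop approach."""
    matrix = np.zeros((len(points), len(centers)))
    for i, point in enumerate(points):
        for j, center in enumerate(centers):
            matrix[i, j] = np.exp(-beta * np.sum((point - center) ** 2))
    return matrix

def interpolation_matrix_vectorized(points, centers, beta):
    """Same matrix computed at once via broadcasting."""
    # diffs has shape (num_points, num_centers, num_features).
    diffs = points[:, None, :] - centers[None, :, :]
    return np.exp(-beta * np.sum(diffs ** 2, axis=2))

# Assumed random data, only to compare the two constructions.
rng = np.random.default_rng(0)
points = rng.normal(size=(50, 2))
centers = rng.normal(size=(5, 2))

loop_result = interpolation_matrix_loop(points, centers, beta=0.5)
vectorized_result = interpolation_matrix_vectorized(points, centers, beta=0.5)
print(np.allclose(loop_result, vectorized_result))
```

Both functions produce the same matrix. The vectorized form avoids Python-level loops, which tends to matter as the number of data points and RBF units grows.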
Conclusion
Radial basis function neural networks are powerful tools for function approximation problems. They are relatively simple to implement and can model complex non-linear relationships. However, they require careful tuning of their parameters and may not be suitable for high-dimensional data due to the curse of dimensionality.