Restricted Boltzmann Machines

Learn how simple networks can “learn” the distribution of image data and serve as building blocks for larger networks.

The neural network model that we will apply to the MNIST data has its origins in earlier research on how neurons in the mammalian brain might work together to transmit signals and encode patterns as memories. As we discussed earlier, Hebbian learning states, “Neurons that fire together, wire together” (Gurney, Kevin. 2018. An Introduction to Neural Networks. CRC Press.), and many models, including the multilayer perceptron, made use of this idea to develop learning rules.

One of these models was the Hopfield network (Sathasivam, Saratha. 2008. Logic Learning in Hopfield Networks.), developed in the 1970s–80s by several researchers (Hebb, D. O. 2002. The Organization of Behavior: A Neuropsychological Theory. Lawrence Erlbaum.). In this network, each “neuron” is connected to every other neuron by a symmetric weight, but there are no self-connections (only connections between distinct neurons, no self-loops). Unlike the multilayer perceptron and the other architectures we have studied, the Hopfield network is an undirected graph, since its edges go both ways.

Simplified architecture of the Hopfield network

The neurons in the Hopfield network take on binary values, either $(-1, 1)$ or $(0, 1)$, computed as a thresholded version of the tanh or sigmoidal activation function.
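One common way to write this thresholded update, using the $(-1, 1)$ convention with $\sigma_i$ denoting the threshold of neuron $i$ and $w_{ij}$ the weight between neurons $i$ and $j$, is:

$$
s_i \leftarrow \begin{cases} +1 & \text{if } \sum_j w_{ij}\, s_j \ge \sigma_i \\ -1 & \text{otherwise} \end{cases}
$$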

The threshold values ($\sigma$) never change during training; to update the weights, a Hebbian approach is to take a set of $n$ binary patterns (configurations of all the neurons) and update the weights from them.
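A standard form of this Hebbian update, with $\epsilon_i^{\mu}$ denoting the binary activation of neuron $i$ in pattern $\mu$, is:

$$
w_{ij} = \frac{1}{n} \sum_{\mu=1}^{n} \epsilon_i^{\mu}\, \epsilon_j^{\mu}
$$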

where $n$ is the number of patterns, and $\epsilon_i$ and $\epsilon_j$ are the binary activations of neurons $i$ and $j$ in a particular configuration. Looking at this equation, you can see that if two neurons share a configuration, the connection between them is strengthened, while if they have opposite signs (one neuron has a sign of $+1$, the other $-1$), it is weakened. Following this rule to iteratively strengthen or weaken connections leads the network to converge to a stable configuration that resembles a memory for a particular activation of the network, given some input. This represents a model for associative memory in biological organisms (Suzuki, Wendy A. 2005. Associative Learning and the Hippocampus. Psychological Science Agenda. American Psychological Association. https://www.apa.org/science/about/psa/2005/02/suzuki), the kind of memory that links unrelated ideas, just as the neurons in the Hopfield network are linked together (Hammersley, J. M.; Clifford, P. 1971. Markov Fields on Finite Graphs and Lattices; Clifford, P. 1990. Markov Random Fields in Statistics, in Grimmett, G. R.; Welsh, D. J. A. (eds.), Disorder in Physical Systems: A Volume in Honour of John M. Hammersley, Oxford University Press, pp. 19–32.).
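To make the storage-and-recall behavior concrete, here is a minimal NumPy sketch (an illustrative example, not code from this lesson): it stores two $\pm 1$ patterns with a Hebbian rule of the form above, then recovers one of them from a corrupted probe by repeatedly applying the thresholded update with all thresholds set to zero.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian storage: strengthen w_ij when neurons i and j share a sign
    across the stored patterns, and weaken it when their signs differ."""
    n, _ = patterns.shape
    weights = patterns.T @ patterns / n
    np.fill_diagonal(weights, 0.0)   # no self-connections
    return weights

def recall(weights, state, sweeps=5):
    """Asynchronous recall: visit each neuron in turn and apply a
    thresholded (sign) activation until the configuration stabilizes."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(len(state)):
            state[i] = 1 if weights[i] @ state >= 0 else -1
    return state

# Store two 6-neuron patterns, then recover the first from a noisy probe.
patterns = np.array([[ 1,  1,  1, -1, -1, -1],
                     [ 1, -1,  1, -1,  1, -1]])
W = train_hopfield(patterns)

probe = np.array([1, 1, 1, -1, -1, 1])   # first pattern with one bit flipped
print(recall(W, probe))                  # settles back to [ 1  1  1 -1 -1 -1]
```

Flipping one bit of a stored pattern and watching the network settle back to it illustrates the associative-memory behavior described above.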

Besides representing biological memory, Hopfield networks also have an interesting parallel to electromagnetism. If we consider each neuron as a particle or charge, we can describe the model in terms of a free energy equation that represents how the particles in this system mutually repel or attract each other, and where the system lies, relative to equilibrium, on the distribution of potential configurations.
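A common form of this energy for a Hopfield network, with $s_i$ denoting the state of neuron $i$ and the thresholds $\sigma_i$ included as a bias term, is:

$$
E = -\frac{1}{2} \sum_{i,j} w_{ij}\, s_i\, s_j + \sum_i \sigma_i\, s_i
$$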

where $w_{ij}$ is the weight between neurons $i$ and $j$ ...
