Trusted answers to developer questions

How to implement the softmax function in Python

Talha Ashar


The softmax function is a mathematical function that converts a vector of real values into a vector of probabilities that sum to 1. Each value in the original vector is converted to a number between 0 and 1.

The formula of the softmax function is shown below:

$$\sigma(\vec{z})_{i} = \frac{e^{z_{i}}}{\sum_{j=1}^{K} e^{z_{j}}}$$

As shown above, the softmax function accepts a vector z of length K. For each value in z, the softmax function applies the standard exponential function to the value. It then divides the result by the sum of the exponentials of all values in z.
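
The statement that the outputs sum to 1 follows from the formula itself, since every output shares the same denominator. A short derivation, written with the same symbols as the formula above:

```latex
\sum_{i=1}^{K} \sigma(\vec{z})_{i}
  = \sum_{i=1}^{K} \frac{e^{z_{i}}}{\sum_{j=1}^{K} e^{z_{j}}}
  = \frac{\sum_{i=1}^{K} e^{z_{i}}}{\sum_{j=1}^{K} e^{z_{j}}}
  = 1
```

Because every exponential is positive, each output is positive as well, and no single output can exceed 1.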


Consider the following vector:

z = [5, 2, 8]

First, let’s calculate the exponential of each value in z.

$e^{z_{1}} = e^{5} \approx 148.4$

$e^{z_{2}} = e^{2} \approx 7.4$

$e^{z_{3}} = e^{8} \approx 2981.0$

Next, we can calculate the sum of the exponentials:

$\sum_{j=1}^{K} e^{z_{j}} = e^{z_{1}} + e^{z_{2}} + e^{z_{3}}$

$\sum_{j=1}^{K} e^{z_{j}} = 148.4 + 7.4 + 2981.0 = 3136.8$

Finally, we can calculate the softmax equivalent for each value in z, as shown below:

$\sigma(\vec{z})_{1} = \frac{148.4}{3136.8} \approx 0.0473$

$\sigma(\vec{z})_{2} = \frac{7.4}{3136.8} \approx 0.0024$

$\sigma(\vec{z})_{3} = \frac{2981.0}{3136.8} \approx 0.9503$

So, we end up with a vector of probabilities:

Softmax(z) = [0.0473, 0.0024, 0.9503]
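
The hand calculation above can be checked with a few lines of Python. This is only a quick sketch of the same three steps (exponentiate, sum, divide); the names `exps`, `total`, and `probs` are chosen here for illustration.

```python
import math

z = [5, 2, 8]

# Step 1: exponential of each value in z
exps = [math.exp(v) for v in z]

# Step 2: sum of the exponentials
total = sum(exps)

# Step 3: divide each exponential by the sum, rounded to four decimals
probs = [round(e / total, 4) for e in exps]

print(probs)
```

The printed values should agree with the vector worked out by hand, up to rounding.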


The code below shows how to implement the softmax function in Python:

import math

# softmax function
def softmax(z):

    # vector to hold exponential values
    exponents = []

    # vector to hold softmax probabilities
    softmax_prob = []

    # sum of exponentials
    exp_sum = 0

    # for each value in the input vector
    for value in z:

        # calculate the exponential
        exp_value = math.exp(value)

        # append to exponent vector
        exponents.append(exp_value)

        # add to exponential sum
        exp_sum += exp_value

    # for each exponential value
    for value in exponents:

        # calculate softmax probability
        probability = value / exp_sum

        # append to probability vector
        softmax_prob.append(probability)

    return softmax_prob

# define vector
z = [5, 2, 8]

# find softmax
result = softmax(z)
print(result)


In the code above:

  • Line 1: We import the math library.
  • Line 4: We define the softmax function that accepts a vector as a parameter.
  • Lines 7-13: We declare three variables to store the exponential of each value, the corresponding probability, and the sum of all exponentials, respectively.
  • Lines 16-25: We use a for-loop to iterate over each value in the given vector. For each value, we calculate its exponential with the math.exp() function and append it to exponents. The sum of exponentials is also updated in each iteration of the loop.
  • Lines 28-34: We use another for-loop to find the probability corresponding to each exponential value by dividing the value by exp_sum. Each probability is appended to softmax_prob.
  • Lines 39-43: We declare a vector z containing three values and pass it to the softmax function. The vector returned by the function is output accordingly.
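
One practical caveat: math.exp() raises an OverflowError for large inputs (for example, math.exp(1000)), so a direct implementation can fail on vectors with large values. A common workaround, which is not part of the walkthrough above, is to subtract the maximum of the vector from every value before exponentiating. This leaves the result unchanged because the common factor cancels in the division. The sketch below assumes a non-empty input, and the name `stable_softmax` is hypothetical.

```python
import math

def stable_softmax(z):
    # Assumes z is non-empty; max() raises ValueError on an empty list.
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    shift = max(z)

    # Exponentials of the shifted values, all in the range (0, 1]
    exps = [math.exp(v - shift) for v in z]

    # Normalize by the sum of the shifted exponentials
    total = sum(exps)
    return [e / total for e in exps]

# Same vector as before: the probabilities should match the earlier result
print(stable_softmax([5, 2, 8]))

# Values this large would overflow math.exp() without the shift
print(stable_softmax([1000, 1001, 1002]))
```

For small inputs the two versions should agree up to floating-point error; the shifted version simply keeps working when the values grow.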


