Binary Cross-Entropy Loss in PyTorch

Uncover the different ways you can compute the binary cross-entropy loss in PyTorch.

Sure enough, PyTorch implements the binary cross-entropy loss as nn.BCELoss. Just like its regression counterpart, nn.MSELoss (introduced in the chapter "A Simple Regression Problem"), it is a higher-order function that returns the actual loss function.
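As a minimal sketch of that two-step pattern, the snippet below first creates the loss function and then calls it on a handful of data points. The probabilities and labels are made-up values used only for illustration; the manual computation assumes the usual binary cross-entropy formula averaged over the data points.

```python
import torch
import torch.nn as nn

# Step 1: calling nn.BCELoss returns the actual loss function
loss_fn = nn.BCELoss()  # default reduction is 'mean'

# Hypothetical predicted probabilities and true labels (illustrative values only)
probs = torch.tensor([0.9, 0.2, 0.7, 0.4])
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])  # labels must be floats

# Step 2: the loss function takes predictions first, labels second
loss = loss_fn(probs, labels)

# The same quantity computed by hand, for comparison
manual = -(labels * torch.log(probs)
           + (1 - labels) * torch.log(1 - probs)).mean()

print(loss, manual)
```

Note that nn.BCELoss expects probabilities (values between 0 and 1) as its first argument, not logits.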

The BCELoss higher-order function takes two optional arguments (the others are deprecated, and you can safely ignore them):

  • reduction: It takes one of three strings: 'mean', 'sum', or 'none'. The default, 'mean', corresponds to our equation 6.15 in the previous lesson. As expected, 'sum' returns the sum of the errors instead of the average. The last option, 'none', corresponds to the unreduced form; that is, it returns the full array of errors, one per data point.

  • weight: The default is None, meaning every data point has equal weight. If provided, it needs to be a tensor with a size that equals the number of elements in a mini-batch, representing the weights assigned to each element in the batch. In other words, this argument allows you to assign different weights to each element of the current batch based on its position. So, the first element would have a given weight, the second element would have a different weight, and so on. This is regardless of the actual class of that particular data point. Sounds confusing? Weird? Yes, this is weird. Of course, this is not useless or a mistake, but the proper usage of this argument is a more advanced topic and outside the scope of this course.
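The two arguments can be compared side by side in a short sketch. The probabilities, labels, and weights below are hypothetical values chosen only for illustration, and the weighted example uses reduction='none' so that the effect of each positional weight is visible per data point.

```python
import torch
import torch.nn as nn

# Hypothetical mini-batch of four data points (illustrative values only)
probs = torch.tensor([0.9, 0.2, 0.7, 0.4])
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])

# 'none' returns one error per data point
per_point = nn.BCELoss(reduction='none')(probs, labels)
# 'sum' adds those errors up
total = nn.BCELoss(reduction='sum')(probs, labels)
# 'mean' (the default) averages them
average = nn.BCELoss(reduction='mean')(probs, labels)

# One weight per position in the mini-batch, regardless of each point's class
weights = torch.tensor([1.0, 2.0, 1.0, 0.5])
weighted = nn.BCELoss(reduction='none', weight=weights)(probs, labels)

print(per_point)
print(total, average)
print(weighted)  # each unreduced error scaled by the weight at its position
```

Since the weights are tied to positions in the batch rather than to classes, reshuffling the mini-batch would assign them to different data points.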
