Lecture

Organizing Multiple Probabilities with the Softmax Function

The Softmax function converts multiple numbers into probabilities.

Previously, we learned about the Sigmoid function, which transforms a single number into a value between 0 and 1. In contrast, the Softmax function adjusts several numbers at once so that they sum to 1.

Because of this property, it is frequently used in Multi-Class Classification problems.

Image Classification Example
Input image: cat photo
Output probabilities: Cat: 0.80 (80%), Dog: 0.15 (15%), Rabbit: 0.05 (5%)

As seen here, the Softmax function allows the model to convert its predicted values into probabilities to select the most likely class.
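Once the outputs are probabilities, choosing the prediction is just picking the class with the highest value. A minimal sketch, using the example probabilities above (the class names are illustrative):

```python
# Softmax output for the example image: class -> probability
probs = {"cat": 0.80, "dog": 0.15, "rabbit": 0.05}

# The predicted class is simply the one with the highest probability.
prediction = max(probs, key=probs.get)
print(prediction)  # cat
```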


How the Softmax Function Works

The Softmax function is defined by the following formula:

P(y_i) = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}

Each number ($z_i$) is transformed using the exponential function ($e^x$) and then divided by the sum of all the exponentiated values to produce a probability.

This ensures the sum of the probabilities is always 1.

Softmax Output Example Based on Input Values
Input: [2.0, 1.0, 0.1] Output: [0.66, 0.24, 0.10] (Sum of probabilities = 1)

The larger the input value, the higher the probability; smaller values yield lower probabilities.
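The formula above can be written directly as a short function. A sketch reproducing the example input (the function name is my own, not from the lesson):

```python
import math

def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    exps = [math.exp(s) for s in scores]   # e^z_i for each score
    total = sum(exps)                      # denominator of the formula
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print([round(p, 2) for p in probs])  # [0.66, 0.24, 0.1]
```

Note that the largest input (2.0) receives the largest probability, matching the behavior described above.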


Advantages of the Softmax Function

The Softmax function makes the results of multi-class classification problems easier to interpret.

By converting all outputs to probability values, it allows for easy selection of the most likely class.

Additionally, it provides an intuitive understanding of how confident the model is in its predictions.


Limitations of the Softmax Function

Since the Softmax function transforms the probabilities of each class into relative values, the probability of a certain class can be influenced by other classes.

In other words, as the probability of one class increases, the probabilities of other classes decrease.

Moreover, if the predicted values are extremely large or small, one value may approach 1 while others remain nearly 0, making training difficult.

To address this, techniques for appropriately adjusting output values are necessary.
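One standard adjustment is to subtract the largest score from every score before exponentiating. This leaves the result mathematically unchanged but prevents the exponential from overflowing. A sketch of this common stability trick (the lesson does not name a specific technique, so this is one example):

```python
import math

def stable_softmax(scores):
    # Shifting all scores by the maximum changes nothing mathematically
    # (the shift cancels in the ratio) but keeps exp() from overflowing.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# math.exp(1000.0) would overflow, but the shifted version works fine.
print(stable_softmax([1000.0, 999.0, 998.0]))
```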


The Softmax function is an essential tool for performing multi-class classification in machine learning.

In the next lesson, we will compare the activation functions we have learned so far.

Mission

Which of the following is most appropriate to fill in the blank?

The softmax function converts multiple numbers into probabilities, adjusting them so that their sum is ____.
0
1
100
1000
