Transforming Values to Probabilities with the Sigmoid Function
The Sigmoid function is used to transform input values into outputs between 0 and 1.
This characteristic is particularly useful in machine learning models, such as those used to probabilistically classify spam emails.
Probability of being spam: 0.85 (85%) → Classified as spam
Probability of being spam: 0.20 (20%) → Classified as normal
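The step from a probability to a label can be sketched as below. The lesson does not state a cutoff, so the 0.5 threshold and the names `SPAM_THRESHOLD` and `classify` are assumptions made for illustration only.

```python
# Assumed cutoff: the lesson does not say which threshold is used.
SPAM_THRESHOLD = 0.5


def classify(prob_spam: float, threshold: float = SPAM_THRESHOLD) -> str:
    """Turn a spam probability into a label (hypothetical helper)."""
    return "spam" if prob_spam >= threshold else "normal"


# The two example probabilities from the lesson.
for p in (0.85, 0.20):
    print(f"Probability of being spam: {p:.2f} -> Classified as {classify(p)}")
```

A real spam filter might pick a different threshold, depending on how costly it is to misclassify a normal email as spam.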
How the Sigmoid Function Works
The Sigmoid function outputs a value between 0 and 1 regardless of how large or small the input is.
The outputs of the Sigmoid function have the following characteristics:
- Outputs a value close to 1 for large positive input values
- Outputs a value close to 0 for large negative input values
- Outputs 0.5 when the input is 0
In this way, the Sigmoid function translates input values into outputs that can be interpreted as probabilities.
Input: 10 → Output: 0.99995 (Almost 1)
Input: 2 → Output: 0.88
Input: 0 → Output: 0.50
Input: -2 → Output: 0.12
Input: -10 → Output: 0.00005 (Almost 0)
The larger the input is, the closer the output is to 1, and the smaller the input is, the closer the output is to 0.
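The behavior described above can be checked with a few lines of Python. This is a minimal sketch that uses the standard logistic definition, sigmoid(x) = 1 / (1 + e^(-x)), which the lesson does not write out explicitly. The function name `sigmoid` is chosen here for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid: 1 / (1 + e^(-x)).

    Fine for moderate inputs; math.exp overflows for very large negative x.
    """
    return 1.0 / (1.0 + math.exp(-x))


# The five example inputs from the lesson.
for x in (10, 2, 0, -2, -10):
    print(f"Input: {x:>3} -> Output: {sigmoid(x):.5f}")
```

The output never reaches exactly 0 or 1 for these inputs. It only gets closer as the input moves further from 0.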
Limitations of the Sigmoid Function
While the Sigmoid function provides an intuitive probabilistic interpretation, it also has some limitations.
1. Minimal Change for Extreme Values
For inputs that are very large in magnitude, positive or negative, the Sigmoid output sits very close to 1 or 0 and barely changes as the input changes, so its gradient is close to 0.
This contributes to the Vanishing Gradient problem when training deep networks, and other activation functions such as ReLU are often used in deep learning to mitigate it.
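To make the shrinking gradient concrete, the sketch below evaluates the Sigmoid derivative, which for the standard logistic definition is sigmoid(x) * (1 - sigmoid(x)), and compares it with the ReLU derivative. The function names are illustrative, and the ReLU gradient at exactly 0 is set to 0 by convention here.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s * (1 - s), largest at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x: float) -> float:
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise (by convention at 0)."""
    return 1.0 if x > 0 else 0.0


for x in (0, 2, 5, 10):
    print(f"x = {x:>2}: sigmoid gradient = {sigmoid_grad(x):.6f}, ReLU gradient = {relu_grad(x):.1f}")
```

The Sigmoid gradient peaks at 0.25 when the input is 0 and falls toward 0 as the input grows, while the ReLU gradient stays at 1 for every positive input.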
2. Sharp Changes Only Near the Middle
Significant changes in the Sigmoid output occur mainly when input values are roughly between -2 and 2. Outside this range, the output changes very little.
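One way to see this is to measure how far the output moves across the middle range compared with the tails. The sketch below assumes the standard logistic definition. The helper `output_change` and the choice of -10 and 10 as tail endpoints are illustrative, not from the lesson.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def output_change(lo: float, hi: float) -> float:
    """How much the sigmoid output rises as the input goes from lo to hi."""
    return sigmoid(hi) - sigmoid(lo)


print(f"Middle (-2 to 2):       {output_change(-2, 2):.3f}")
print(f"Upper tail (2 to 10):   {output_change(2, 10):.3f}")
print(f"Lower tail (-10 to -2): {output_change(-10, -2):.3f}")
```

Most of the total movement between 0 and 1 happens inside the middle range, even though that range is much narrower than either tail.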
The Sigmoid function is a powerful tool for making probabilistic predictions in AI models.
In the next lesson, we'll tackle a simple quiz to apply what we've learned so far.
Which of the following is true about the characteristics of a sigmoid function?
When the input value is large, it outputs a value close to 0.
When the input value is small, it outputs a value close to 1.
When the input value is 0, it outputs 0.5.
Regardless of the input value, it always outputs 1.