Limitations of a Single-Layer Perceptron
A perceptron is a simple algorithm that takes input values, applies a threshold, and divides the inputs into two groups, much like classifying students as pass (1) or fail (0) based on their grades. For instance, a test score of 50 or above is a pass (1), while a score below 50 is a fail (0).

Problems that can be split by a single line (threshold) are easily solved by a single-layer perceptron. However, not all problems are this straightforward.
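As a minimal sketch of this idea (the function name and the threshold of 50 are just the example values from above, not a standard API):

```python
def perceptron_pass_fail(score: float, threshold: float = 50.0) -> int:
    """Classify a grade: 1 (pass) if score >= threshold, else 0 (fail)."""
    return 1 if score >= threshold else 0

print(perceptron_pass_fail(72))  # 1 (pass)
print(perceptron_pass_fail(35))  # 0 (fail)
```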
Problems a Perceptron Cannot Solve
Consider a rule that outputs 1 only when the two input values are different (0, 1 or 1, 0).
| Input 1 | Input 2 | Output |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |
This is known as the XOR problem.
When represented in an (x, y) coordinate system, it looks like this:

- (0, 0) and (1, 1) belong to the same group, since both have an output of 0.
- (0, 1) and (1, 0) belong to the same group, since both have an output of 1.
What happens when you try to separate these two groups with a single straight line?
No matter how you draw the line, you cannot perfectly separate these groups.
A single-layer perceptron, which can only draw one line, cannot solve this classification problem.
Problems that cannot be separated by a single straight line are called non-linear problems. These require more complex decision boundaries, such as curves, to solve.
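To make this concrete, we can exhaustively check that no single-layer perceptron reproduces the XOR table. The sketch below searches a grid of weights w1, w2 and biases b for an output rule step(w1*x1 + w2*x2 + b) that matches XOR; the grid itself is an arbitrary illustration, since the impossibility in fact holds for all real-valued weights:

```python
import itertools

def step(s: float) -> int:
    return 1 if s >= 0 else 0

xor_table = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Try every (w1, w2, b) on a coarse grid and test all four XOR cases.
grid = [i / 2 for i in range(-10, 11)]  # -5.0 .. 5.0 in steps of 0.5
found = any(
    all(step(w1 * x1 + w2 * x2 + b) == y for (x1, x2), y in xor_table.items())
    for w1, w2, b in itertools.product(grid, repeat=3)
)

print("Single-layer solution found:", found)  # False: no line separates XOR
```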
Solving Non-Linear Problems with a Multi-Layer Perceptron
A Multi-Layer Perceptron (MLP) is an artificial neural network composed of multiple perceptrons arranged in layers. Unlike a single-layer perceptron, which classifies data with a single line, a multi-layer perceptron transforms the input data through several layers, enabling it to learn more complex patterns.

Between the input layer and the output layer of a multi-layer perceptron lies a hidden layer.
In the hidden layer, weights are applied to the input data, and an activation function is used to generate new features. Through this transformation, data that could not be linearly separated in its original form is mapped into a new representation.
In the XOR problem, for example, linear separation is impossible in the original 2D space, but after passing through the hidden layer, the data can be mapped into a higher-dimensional space (e.g., three dimensions) where linear separation becomes possible.
Thus, a multi-layer perceptron utilizes hidden layers to transform input data and solve non-linear problems.
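As a concrete illustration, here is a minimal hand-wired MLP for XOR (the weights are chosen by hand for clarity, not learned through training). The two hidden perceptrons behave like OR and NAND gates; in the new feature space (h1, h2) the two groups become linearly separable, so a single output perceptron (behaving like AND) finishes the job:

```python
def step(s: float) -> int:
    return 1 if s >= 0 else 0

def mlp_xor(x1: int, x2: int) -> int:
    # Hidden layer: transform the inputs into new features.
    h1 = step(x1 + x2 - 0.5)   # behaves like OR
    h2 = step(-x1 - x2 + 1.5)  # behaves like NAND
    # Output layer: one perceptron on the new features (behaves like AND).
    return step(h1 + h2 - 1.5)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", mlp_xor(*x))
# (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 1, (1, 1) -> 0
```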