The ReLU Function: Activating Only the Positive
The ReLU (Rectified Linear Unit) function is one of the most widely used activation functions in artificial neural networks. It performs a simple operation: it outputs the input value if it is greater than 0; otherwise, it outputs 0.

In previous lessons, we explored the Sigmoid Function, which squashes all values into the range between 0 and 1. In contrast, ReLU zeroes out negative values and passes non-negative values through unchanged.
Input: 3 → Output: 3
Input: 0 → Output: 0
Input: -5 → Output: 0
The ReLU function decides whether a neuron in the network should be activated. When the input is positive, it passes the value through unchanged, retaining the information. When the input is negative, it outputs zero, which simplifies the computation.
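To make the contrast with the sigmoid concrete, here is a minimal sketch (not taken from this lesson) that evaluates both functions on the sample inputs above; the sample values are illustrative only.

```python
# Contrast sigmoid and ReLU on a few sample inputs.
import math

def sigmoid(x):
    # Squashes any real value into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Passes positive values through unchanged, zeroes out the rest.
    return x if x > 0 else 0

for x in [3, 0, -5]:
    print(f"x={x:>2}  sigmoid={sigmoid(x):.3f}  relu={relu(x)}")
```

Notice that the sigmoid still produces a small nonzero output for -5, while ReLU maps every negative input to exactly 0.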
How the ReLU Function Works
The ReLU function is defined by the following equation:

ReLU(x) = max(0, x)

It outputs the input value as is if it is greater than 0; otherwise, it outputs 0.

- If the input is positive, it outputs the value as is.
- If the input is zero or negative, it outputs 0.
Input: 5 → Output: 5
Input: 0 → Output: 0
Input: -3 → Output: 0
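Because ReLU is applied elementwise in practice, a minimal vectorized sketch (assuming NumPy; the array contents simply mirror the inputs listed above) looks like this:

```python
# Elementwise ReLU(x) = max(0, x) using NumPy.
import numpy as np

def relu(x):
    # np.maximum compares each element against 0 and keeps the larger value.
    return np.maximum(0, x)

x = np.array([5, 0, -3])
print(relu(x))  # [5 0 0]
```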
Advantages of the ReLU Function
The ReLU function is among the most frequently used activation functions in deep learning.

The first advantage is that it mitigates the vanishing gradient problem. Unlike the sigmoid function, whose gradient approaches zero for large inputs and makes learning difficult, ReLU keeps a constant gradient of 1 for every positive input.
The second advantage is its simplicity and speed of computation. The ReLU function only performs the max(0, x) operation, making it faster than activation functions like the sigmoid, which involve exponentials and divisions.
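As a minimal sketch of the gradient argument (illustrative, not from this lesson), the comparison below evaluates each function's derivative at a few increasingly large inputs: the sigmoid gradient shrinks toward zero, while the ReLU gradient stays at 1 for any positive input.

```python
# Compare the derivatives of sigmoid and ReLU for growing inputs.
import math

def sigmoid_grad(x):
    # Derivative of the sigmoid: s(x) * (1 - s(x)).
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of max(0, x): 1 for x > 0, 0 for x < 0
    # (undefined at 0, commonly taken as 0 in practice).
    return 1.0 if x > 0 else 0.0

for x in [1, 5, 10, 20]:
    print(f"x={x:>2}  sigmoid'={sigmoid_grad(x):.2e}  relu'={relu_grad(x)}")
```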
Limitations of the ReLU Function
Despite its advantages, the ReLU function has some downsides. The most notable issue is the dead neuron problem. Because the function outputs 0 for any non-positive input, its gradient is also 0 there; a neuron whose inputs consistently fall below zero stops receiving weight updates and can become permanently inactive during training.
To address this, variants such as Leaky ReLU or ELU are often used.
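A minimal sketch of these two variants follows, using their standard formulas; the slope alpha=0.01 and scale alpha=1.0 are conventional defaults chosen here for illustration.

```python
# Leaky ReLU and ELU keep a small, nonzero response for negative inputs,
# so neurons are less likely to "die".
import math

def leaky_relu(x, alpha=0.01):
    # Negative inputs are scaled by a small slope instead of being zeroed.
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    # Negative inputs decay smoothly toward -alpha instead of hitting 0.
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

for x in [3, 0, -5]:
    print(f"x={x:>2}  leaky_relu={leaky_relu(x):.2f}  elu={elu(x):.2f}")
```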
Additionally, because ReLU is unbounded above, very large inputs produce equally large outputs, which can destabilize the model. The Clipped ReLU, a variant that caps the output at a fixed ceiling, is used to tackle this issue.
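As a minimal sketch of the idea, the function below caps the ReLU output at a ceiling of 6 (the common "ReLU6" choice); the cap value is an assumption made here for illustration.

```python
# Clipped ReLU: apply ReLU, then cap the result at a fixed ceiling.
def clipped_relu(x, ceiling=6.0):
    return min(max(0.0, x), ceiling)

for x in [3, 10, -2]:
    print(f"x={x:>3}  clipped_relu={clipped_relu(x)}")
```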
The ReLU function is one of the most widely used activation functions in deep learning: its simplicity and low computational cost help models train faster.

In the next lesson, we will explore the Softmax activation function.
The ReLU function always outputs 0 when the input is less than or equal to 0.