lesson1Title

lesson2Title

lesson3Title

lesson4Title

lesson5Title

lesson6Title

lesson7Title

lesson8Title

lesson9Title

lesson10Title

aiFineTuningBasicsChapter3Title

lesson11Title

lesson12Title

lesson13Title

aiFineTuningBasicsChapter1Title

aiFineTuningBasicsChapter2Title

# Number of Epochs in Training

An `epoch` is defined as one complete pass through the entire training dataset. In other words, an epoch occurs when the model has utilized all the training data once.

For instance, if a dataset consists of 1,000 samples and you train the model for 10 epochs, the model will go through these 1,000 samples ten times.

The unit that processes a batch within an epoch is called a `step`, and an epoch is essentially a **collection of steps**.

For example, if a dataset contains 1,000 samples and the batch size is 100, one epoch would require 10 steps.

<br />

## Commonly Used Number of Epochs

The number of epochs varies depending on the dataset, model complexity, and the nature of the problem. Typically, it ranges from several tens to several hundreds of epochs.

- *Simple models or small datasets*: 10-50 epochs

- *Medium-sized models or datasets*: 50-200 epochs

- *Large models and datasets*: 200 epochs or more

Generally, the optimal number of epochs is determined experimentally, and techniques such as `early stopping`, which halts training when performance on the validation dataset no longer improves, can help adjust the optimal number of epochs.

<br />

## Pros and Cons of Having a High Number of Epochs

### Pros
- *Increased convergence likelihood*: With a low learning rate and a high number of epochs, the model is more likely to converge to optimal weights.

- *Tackling complex problems*: For very complex problems or datasets, a sufficient number of epochs allows the model to better understand and solve the problem.

<br />

### Cons
- *Overfitting*: Training with too many epochs can lead to overfitting, where the model aligns too closely to the training data, resulting in poor generalization to test or real-world data.

- *Time-consuming*: More epochs extend training time, potentially leading to a waste of computational resources and time.

During one epoch, the model trains on every sample in the dataset.

### What is the most appropriate word to fill in the blank?