# Summary of Course Highlights

## What is an AI Model?

A computer program that analyzes given data to learn patterns and rules, and then makes **predictions and decisions** about new data.

 

## What is Fine-Tuning?

The process of **retraining** an already trained AI model to improve its performance for specific tasks or purposes.

 

## What Does it Mean for AI to Learn?

It means extracting features from numerous examples to learn `patterns` and developing the ability to process new data accurately.

Technical explanation: Training produces an **algorithm** (a step-by-step procedure for performing a task) that determines the output for new input data.

 

## The Role of Weights and Biases

AI adjusts its parameters, including weights and biases, to learn the data patterns and make predictions for new data.

| Term | Description |
|-------------|--------------------------------------------------|
| Weights | Determine the importance of specific features in input data |
| Biases | Constant values added to the weighted sum of the inputs, shifting the output independently of the input values |

```text title="Equation for Weights and Biases"
y = w1x1 + w2x2 + ... + wnxn + b
```
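As a minimal sketch, the equation above can be written as a small Python function. The name `predict` and the sample numbers are illustrative assumptions, not values from the course.

```python
def predict(features, weights, bias):
    """Return y = w1*x1 + w2*x2 + ... + wn*xn + b."""
    # Multiply each input feature by its weight, add the products, then add the bias.
    return sum(w * x for w, x in zip(weights, features)) + bias


# Hypothetical example values chosen only for illustration.
features = [2.0, 3.0]   # x1, x2
weights = [0.5, -1.0]   # w1, w2
bias = 4.0              # b

y = predict(features, weights, bias)
print(y)  # 0.5*2.0 + (-1.0)*3.0 + 4.0 = 2.0
```

During training, the model changes `weights` and `bias`; the input `features` come from the data.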

 

## What is JSON?

A **lightweight data format** used for storing and exchanging data.

JSON consists of objects and arrays, with objects enclosed in curly braces `{ }` and arrays in square brackets `[ ]`.

```json title="JSON Example"
[
  {
    "name": "John Doe",
    "math": 85,
    "english": 90
  },
  {
    "name": "Jane Smith",
    "math": 88,
    "english": 80
  }
]
```

A `JSONL` file used for fine-tuning stores one JSON object per line.
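The sketch below shows one way to turn a list of records into JSONL text and read it back, using Python's standard `json` module. The records reuse the sample data from the JSON example; the variable names are assumptions made for illustration.

```python
import json

# Each dictionary becomes one line of the JSONL output.
records = [
    {"name": "John Doe", "math": 85, "english": 90},
    {"name": "Jane Smith", "math": 88, "english": 80},
]

# Serialize: one JSON object per line, with no enclosing array.
jsonl_text = "\n".join(json.dumps(record) for record in records)
print(jsonl_text)

# Parse: read the text line by line and decode each line on its own.
parsed = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
```

Because every line is a complete JSON object, a large file can be processed one entry at a time without loading everything into memory.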

 

## What is a Dataset?

A collection of data gathered and organized for specific purposes, such as AI model training and validation.

| Component | Description |
|---------------|--------------------------------------------------|
| Features | Input data that the model learns from |
| Labels | The correct answer for each input example |
| Metadata | Documentation providing additional information like data source, creation date, etc. |

 

## Types of Datasets

| Dataset Type | Description |
|-----------------------|--------------------------------------------------|
| Training Dataset | Used to train the model |
| Validation Dataset | Used to evaluate model performance during training |
| Test Dataset | Used to evaluate the final model on data it has not seen during training |

- *Training Dataset*: About 60-80% of the total data

- *Validation Dataset*: About 10-20% of the total data

- *Test Dataset*: About 10-20% of the total data
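A minimal sketch of such a split, assuming an 80/10/10 ratio and a hypothetical helper named `split_dataset`. The ratio and the fixed random seed are illustrative choices, not requirements.

```python
import random


def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the examples and split them into training, validation, and test sets."""
    shuffled = list(examples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

Shuffling before splitting helps each subset reflect the overall data, and keeping the three sets disjoint ensures the test set measures performance on examples the model never trained on.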

 

## What is a Loss Function?

A function that measures the difference between model predictions and actual values, taking predictions and actual values (correct answers) as input and returning the loss (error) as output.

A smaller value of the loss function indicates that the model's predictions are closer to the actual values, and the goal of AI training is to **minimize** the value of the loss function.

The derivative of the loss function with respect to the model's parameters, known as the gradient, indicates how to adjust those parameters to reduce the loss.
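To make this concrete, here is a small sketch that uses mean squared error (MSE) as the loss function for a one-weight model `y = w * x`. MSE, the toy data, and the step size of `0.1` are assumptions chosen for illustration; the course summary does not prescribe a specific loss function.

```python
def mse_loss(predictions, targets):
    """Mean squared error: the average of the squared differences."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def mse_gradient(w, xs, targets):
    """Derivative of the MSE loss with respect to w for the model y = w * x."""
    return sum(2 * (w * x - t) * x for x, t in zip(xs, targets)) / len(xs)


# Toy data generated by y = 2 * x, so the ideal weight is 2.
xs = [1.0, 2.0, 3.0]
targets = [2.0, 4.0, 6.0]

w = 0.0
loss_before = mse_loss([w * x for x in xs], targets)

# The gradient is negative here, meaning the loss falls as w increases.
grad = mse_gradient(w, xs, targets)

# Move the weight a small step in the direction opposite to the gradient.
w_new = w - 0.1 * grad
loss_after = mse_loss([w_new * x for x in xs], targets)

print(loss_before, loss_after)
```

After one update step the loss is smaller than before, which is the behavior training repeats many times to minimize the loss.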

 

## Hyperparameters

**Settings** chosen before training an AI model that control how the training proceeds. Unlike weights and biases, they are not learned from the data.

| Key Hyperparameters | Description |
|-------------------------------|-------------------------------------------------|
| Learning Rate | Controls how much the model's parameters change at each update step |
| Batch Size | Number of data points processed in a single update step |
| Number of Epochs | The number of times the entire dataset is iterated over during training |
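The sketch below shows where each of these three hyperparameters appears in a simple training loop. It fits a toy model `y = w * x + b` with mini-batch gradient descent on a squared-error loss; the function name `train`, the data, and the default values are all assumptions chosen for illustration.

```python
def train(data, learning_rate=0.05, batch_size=2, num_epochs=300):
    """Fit y = w * x + b with mini-batch gradient descent on squared error."""
    w, b = 0.0, 0.0
    for _ in range(num_epochs):                       # one epoch = one full pass over the data
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]    # batch_size examples per update
            # Gradients of the mean squared error over the current batch.
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # The learning rate scales how far each update moves the parameters.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Toy data generated by y = 2 * x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
print(round(w, 3), round(b, 3))
```

With these settings the learned values end up close to the weight 2 and bias 1 that generated the data. A learning rate that is too large can make the updates overshoot, while one that is too small makes training slow.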
