Lecture

What Does It Mean for AI to 'Learn'?

Training an AI means extracting features from many example data points so that it learns patterns and can then handle new, unseen data accurately.

From a more technical perspective, training an AI means building an algorithm (a step-by-step procedure for solving a task) whose internal values are adjusted using the example data, so that it can determine the output for new input data.

Let's explore the AI learning process in detail through an example of training an AI to classify spam emails.

1. Data Collection and Preprocessing

First, prepare a large amount of email data for the AI model to learn from, and transform this data into a format the AI model can understand. This process is known as preprocessing.

For example, preprocessing includes encoding a categorical input such as gender as a number (1 for male, 0 for female) and converting certain words into numbers according to a specific rule.

Also, handling missing data and removing duplicate data are important preprocessing tasks.
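
The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a dict with a "text" key), the helper name preprocess, and the two features it computes are hypothetical and not part of any particular library.

```python
def preprocess(emails):
    """Clean raw email records and turn them into simple numeric features."""
    seen = set()
    rows = []
    for email in emails:
        text = email.get("text")
        if not text:
            # Handle missing data: skip records that have no body.
            continue
        text = text.strip().lower()
        if text in seen:
            # Remove duplicates (compared after normalization).
            continue
        seen.add(text)
        words = text.split()
        rows.append({
            "text": text,
            "length": len(words),                      # number of words
            "has_free": 1 if "free" in words else 0,   # word turned into a number
        })
    return rows


raw = [
    {"text": "Win a FREE prize now"},
    {"text": "win a free prize now"},   # duplicate once lowercased
    {"text": None},                     # missing body
    {"text": "Meeting moved to 3pm"},
]
cleaned = preprocess(raw)
for row in cleaned:
    print(row)
```

Of the four raw records, only two survive: the duplicate and the record with a missing body are dropped before any features are computed.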

2. Pattern Analysis (Learning Process)

The AI model extracts features from the data and identifies the patterns that are relevant to its purpose.

Data Input: Input the email text into the model (algorithm).
Pattern Recognition: The model analyzes various features of emails (e.g., frequency of certain words, sender's address, length of the email). At first it weights these features more or less at random, but it gradually learns which features are significant for identifying spam emails.
Repeated Learning: By repeating this process thousands or even millions of times, the model becomes increasingly accurate at recognizing patterns of spam emails.
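
The three steps above can be sketched as a small training loop. This is a minimal sketch, assuming a logistic-regression style model trained by gradient descent; the lecture does not name a specific algorithm, and the toy data, the function names, and the learning rate are all hypothetical.

```python
import math
import random


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=1000, lr=0.5, seed=0):
    """Repeatedly nudge weights and bias to reduce the prediction error."""
    rng = random.Random(seed)
    # Start from small random values: the model knows nothing yet.
    weights = [rng.uniform(-0.1, 0.1) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):                      # repeated learning
        for x, target in zip(samples, labels):   # data input
            z = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(z) - target          # how wrong was the guess?
            # Shift each weight in the direction that shrinks the error.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Toy features per email: [count of "free", count of "meeting"]
samples = [[2, 0], [1, 0], [0, 1], [0, 2]]
labels = [1, 1, 0, 0]   # 1 = spam, 0 = normal

weights, bias = train(samples, labels)
print("weights:", weights, "bias:", bias)
```

After training, the weight for "free" ends up positive and the weight for "meeting" negative, which is the model's learned notion of which feature signals spam.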

3. Storage of Learned Information

The learned patterns are stored as files. Common extensions for such files include .h5, .pkl, and .pb.

Internally, these files consist of matrices or vectors made up of many numbers. These numbers indicate how much importance the AI model assigns to each feature.
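
As a rough illustration of storing learned numbers in a .pkl file, the sketch below uses Python's standard pickle module. The parameter values, the file name, and the helper names save_params and load_params are hypothetical; real frameworks typically provide their own save and load functions and file layouts.

```python
import os
import pickle
import tempfile


def save_params(params, path):
    """Write the learned numbers to disk."""
    with open(path, "wb") as f:
        pickle.dump(params, f)


def load_params(path):
    """Read the learned numbers back from disk."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical learned values for a four-feature model.
params = {"weights": [0.2, -0.4, 0.6, 0.1], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spam_model.pkl")
    save_params(params, path)
    restored = load_params(path)

print(restored)
```

Note that pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.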

Key Terms

Weights: Determine the importance of specific features of the input data. For example, words like free, win, and click might be assigned higher weights so that emails containing them are more likely to be classified as spam.
Bias: A constant that is added to the weighted sum of the inputs, independent of any particular feature. It shifts the model's baseline output, and therefore the point at which a neuron's activation function fires. For instance, if emails are generally more likely to be spam, a larger bias value reflects this, so the model leans toward predicting spam even in the absence of specific words.

The concept of weights and bias can be explained with the following formula:

Weights and Bias Formula

y = w1x1 + w2x2 + ... + wnxn + b

Here, y is the model's output (final result), w represents the weights, x is the input data, and b is the bias.

Because the bias b is added regardless of the inputs, a neuron can still produce a nonzero output even if all input values are zero. In other words, the bias shifts the threshold at which a neuron activates.
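
To make the formula concrete, here is a small worked example. The weights, inputs, and bias are made-up numbers chosen only for illustration, and the function name linear_output is hypothetical.

```python
def linear_output(weights, inputs, bias):
    """Compute y = w1*x1 + w2*x2 + ... + wn*xn + b."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


weights = [1.5, 1.0, -2.0]   # hypothetical weights for "free", "win", "meeting"
bias = -1.0

# An email containing "free" twice and "win" once:
y = linear_output(weights, [2, 1, 0], bias)
print(y)        # 1.5*2 + 1.0*1 + (-2.0)*0 + (-1.0) = 3.0

# With all inputs at zero, only the bias remains:
y_zero = linear_output(weights, [0, 0, 0], bias)
print(y_zero)   # -1.0
```

The second call shows the role of the bias: with no features present, the output equals b.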

Weights and Bias Saved as Files

Example of Weights and Bias Saved as Files

* Weights Matrix:
[
    [0.2, -0.4, 0.6, 0.1],
    [-0.3, 0.8, -0.5, 0.2],
    [0.1, -0.2, 0.3, -0.6],
    [0.7, 0.1, -0.4, 0.5]
]

* Bias Vector:
[0.1, -0.2, 0.3, 0.4]

Here, the weights matrix consists of four rows and four columns, with each weight element indicating the importance of a specific feature that the model has learned.

The bias vector contains four values, one for each row of the weights matrix. Each value is added to the corresponding weighted sum when the model makes a prediction.

The signs and magnitudes show how the AI model evaluates each feature. A positive (+) weight pushes the output up when its feature is present, while a negative (-) weight pushes it down; the larger the magnitude, the stronger the influence.
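
The sketch below applies the example weights matrix and bias vector to one input. It assumes that each row of the matrix produces one output value, paired with the bias at the same position; the input vector x is made up for illustration.

```python
weights = [
    [0.2, -0.4, 0.6, 0.1],
    [-0.3, 0.8, -0.5, 0.2],
    [0.1, -0.2, 0.3, -0.6],
    [0.7, 0.1, -0.4, 0.5],
]
bias = [0.1, -0.2, 0.3, 0.4]

x = [1, 0, 1, 0]   # hypothetical input: features 1 and 3 present

# One output per row: weighted sum of the inputs plus that row's bias.
outputs = [
    sum(w * xi for w, xi in zip(row, x)) + b
    for row, b in zip(weights, bias)
]
print(outputs)   # approximately [0.9, -1.0, 0.7, 0.7]
```

For the second row, both active features carry negative weights (-0.3 and -0.5) and the bias is also negative, so the output is pushed down to about -1.0.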

4. Model Utilization

Once trained, the model is ready to process new emails.

New Data Input: Input a new email into the model.
Pattern Matching: The model analyzes the email's features using the saved weights and biases and predicts whether it is spam.
Result Output: The model outputs whether the email is spam or a normal email.
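
The three steps above can be sketched as follows. This is a minimal sketch, assuming a word-count feature extractor and a sigmoid output with a 0.5 cutoff; the vocabulary, the weights, the bias, and the function names are hypothetical stand-ins for values a trained model would have saved.

```python
import math

# Hypothetical values that would normally be loaded from a saved model file.
VOCAB = ["free", "win", "click", "meeting"]
WEIGHTS = [1.2, 0.9, 1.1, -1.5]
BIAS = -1.0


def extract_features(text):
    """Count how often each vocabulary word appears in the email."""
    words = text.lower().split()
    return [words.count(term) for term in VOCAB]


def predict(text, threshold=0.5):
    """Return a label and the estimated spam probability for a new email."""
    x = extract_features(text)                              # new data input
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS     # pattern matching
    prob = 1.0 / (1.0 + math.exp(-z))
    label = "spam" if prob >= threshold else "normal"       # result output
    return label, prob


print(predict("Click now to win a free prize"))
print(predict("Team meeting at noon"))
```

The first email contains three high-weight words and is labeled spam; the second contains only "meeting", which has a negative weight, and is labeled normal.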

Quiz


Which of the following statements about weights and biases is correct?

Weights represent the importance of specific features learned by the model.

Bias is a value that is multiplied when making predictions.

The sign of a weight must always be positive.

Bias is always larger than the weight matrix.