How Generative AI Understands Prompts
As introduced earlier, AI is a Function. However, unlike a simple function such as f(x) = 3x + 2 from school math, an AI model is an extremely complex function of the form f(countless variables) = a wide range of possible outputs, which is difficult for ordinary people to grasp.
Just as human intelligence emerges from the brain, AI's intelligence comes from models composed of complex functions. An AI model learns from data to form artificial neurons, analogous to the nerve cells of the brain, and uses them to solve the problems it is given.
Most recently released generative AI models are built on an architecture called the Transformer. A Transformer breaks the given prompt down into smaller units such as words and tokens, analyzes them, and generates a sentence by predicting the next word probabilistically.
The process of how AI understands a prompt can be divided into four major stages.
1. Tokenization
A token is the basic unit into which AI breaks down input text, such as a word, punctuation mark, or number. For example, given the sentence "The cat climbed the tree.", AI tokenizes it as follows:
The / cat / climbed / the / tree / .
Each token allows AI to understand the context and meaning of the input. Different AI models may define tokens differently, but the general process remains the same.
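As a rough illustration only (this is not how production models actually tokenize), a naive word-and-punctuation tokenizer could be sketched in Python like this:

```python
import re

def naive_tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens.

    Toy illustration only; real models such as ChatGPT use learned
    subword vocabularies, not a simple regular expression.
    """
    # \w+ matches runs of letters/digits, [^\w\s] matches single punctuation marks
    return re.findall(r"\w+|[^\w\s]", text)

print(naive_tokenize("The cat climbed the tree."))
# ['The', 'cat', 'climbed', 'the', 'tree', '.']
```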
Tokenizing English Text
English tokenization primarily uses spaces or punctuation to separate words.
Example: "The quick brown fox jumps over the lazy dog."
Tokenizing this sentence breaks it into 10 tokens as follows:
- The
- quick
- brown
- fox
- jumps
- over
- the
- lazy
- dog
- .
Here, each word and punctuation mark becomes a token.
Even a single word can be split into several tokens based on prefixes, common patterns, and suffixes. For example, the word "unconscious" can be split into sub-elements such as un (a prefix indicating negation), consc (a common pattern in English words), and ious (a frequent English suffix), and thus be recognized as three tokens.
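To see how a real subword tokenizer splits words, you can experiment with OpenAI's open-source tiktoken library (assuming it is installed, e.g. with pip install tiktoken). The exact pieces depend on the vocabulary the model was trained with, so any split shown in comments is only indicative:

```python
import tiktoken

# Load the tokenizer vocabulary used by recent OpenAI chat models
enc = tiktoken.get_encoding("cl100k_base")

for word in ["unconscious", "cat", "tokenization"]:
    token_ids = enc.encode(word)
    # Decode each token id back to its text piece to see the subword split
    pieces = [enc.decode([tid]) for tid in token_ids]
    print(f"{word!r} -> {len(token_ids)} token(s): {pieces}")

# The output depends on the vocabulary: a rare word is typically broken
# into several familiar sub-pieces, while a common word like "cat" is
# usually a single token.
```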
Tokenizing Other Languages
Tokenizing a language like Korean can be more complex. Because Korean has many postpositional particles and verb endings, it is often tokenized by morphemes (the smallest units of meaning in a language) rather than by simple word boundaries.
Example: "나는 도서관에서 책을 읽고 있었다." ("I was reading a book at the library.")
This sentence can be tokenized into 11 morpheme tokens as follows:
- 나 (I, pronoun)
- 는 (topic particle)
- 도서관 (library, noun)
- 에서 (locative particle, "at")
- 책 (book, noun)
- 을 (object particle)
- 읽 (verb stem, "read")
- 고 (connective ending)
- 있 (auxiliary verb stem, progressive "be")
- 었 (past-tense marker)
- 다 (sentence-final ending)
In general, even for a sentence of similar length, a language like Korean produces more tokens than English. How text is split also depends on the characters being processed: ChatGPT's tokenizer averages roughly one token per three to four characters of English text, while it handles languages like Korean at roughly the morpheme level.
Note: AI models like ChatGPT process text in tokens, and usage costs are often calculated per token. Writing prompts that use tokens efficiently can reduce unnecessary expense.
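A quick way to see this difference in token counts (and therefore cost) is to count tokens for comparable sentences. The sketch below assumes the tiktoken library again; the actual numbers vary by model vocabulary:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

samples = {
    "English": "I was reading a book at the library.",
    "Korean": "나는 도서관에서 책을 읽고 있었다.",
}

for language, sentence in samples.items():
    n_tokens = len(enc.encode(sentence))
    # Korean usually needs noticeably more tokens than English
    # for a sentence of comparable meaning and length.
    print(f"{language}: {n_tokens} tokens for {sentence!r}")
```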
2. Embedding
The tokenized words are converted into numeric vectors. For example, the word cat might be converted into a vector like:
[0.11, 0.34, 0.56, ...]
Words with similar meanings have similar vector values. For instance, the word dog might be converted into a similar vector:
[0.12, 0.84, 0.32, ...]
Words with similar vector values are closely positioned in the vector space.
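The "closeness" between vectors is usually measured with cosine similarity. The sketch below uses the toy three-dimensional vectors from the text (real embeddings have hundreds or thousands of dimensions, and these values are purely illustrative):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: values near 1.0 mean 'similar direction'."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings (illustrative values only, not from a real model)
cat = np.array([0.11, 0.34, 0.56])
dog = np.array([0.12, 0.84, 0.32])
tree = np.array([0.90, 0.10, 0.05])

print("cat vs dog :", round(cosine_similarity(cat, dog), 3))
print("cat vs tree:", round(cosine_similarity(cat, tree), 3))
# Semantically related words tend to score closer to 1.0 than unrelated ones.
```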
3. Context Understanding
AI understands the context of a sentence by using the vector values of the tokenized words. For instance, when the words cat and tree appear together, AI identifies the kind of relationship these two words have.
This involves an Attention Mechanism, which calculates how strongly each word in a sentence is connected to the others, giving higher weight to the more important words.
The Transformer, a type of AI neural network model, utilizes the attention mechanism to identify the relationships between all words simultaneously. For example, it understands how cat and tree interact and identifies cat as the subject performing the action of climbing.
ChatGPT, based on the Transformer model, uses the attention mechanism to grasp the context of the prompt.
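Conceptually, the attention mechanism computes a weight for every pair of tokens and mixes their vectors accordingly. Below is a minimal, single-head sketch of scaled dot-product attention using toy matrices; real Transformers learn separate query/key/value projections and stack many heads and layers:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # how strongly each token relates to every other token
    weights = softmax(scores, axis=-1)   # each row sums to 1: the attention weights
    return weights @ V, weights

# Toy 4-token sentence ("The", "cat", "climbed", "tree") with 3-dimensional vectors
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))

# In a real Transformer, Q, K, V come from learned linear projections of X;
# here we simply reuse X to keep the sketch short.
output, weights = scaled_dot_product_attention(X, X, X)
print("attention weights (each row sums to 1):\n", weights.round(2))
```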
4. Generating Responses
AI predicts the first word of the response from the input vectors. In this process, it uses its pre-trained language model to select the most appropriate word for the given context; for example, "The" and then "cat" might be the first words included in the response.
Once the first word is predicted, AI includes this word in the context and predicts the next word. This process repeats until the response completes, with AI predicting and generating each subsequent word based on all previously generated ones.
- When predicting the next word after "The cat", AI selects "is".
- When predicting the next word after "The cat is", AI selects "a".
AI continues to create words until a coherent sentence is constructed in line with the prompt.
The cat is a small animal often kept as a pet at home.
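This repeated "predict the next word, append it, predict again" loop can be sketched with a toy probability table. A real model computes these probabilities with a Transformer over the whole context rather than looking them up, and often samples from them instead of always taking the most likely word (greedy decoding, as below):

```python
# Toy next-word probabilities keyed by the last generated word.
# These numbers are invented purely for illustration.
NEXT_WORD_PROBS = {
    "<start>": {"The": 0.9, "A": 0.1},
    "The":     {"cat": 0.6, "dog": 0.4},
    "cat":     {"is": 0.7, "climbed": 0.3},
    "is":      {"a": 0.8, "small": 0.2},
    "a":       {"small": 0.9, "pet": 0.1},
    "small":   {"animal": 1.0},
    "animal":  {"<end>": 1.0},
}

def generate(max_words: int = 10) -> str:
    words = []
    current = "<start>"
    for _ in range(max_words):
        candidates = NEXT_WORD_PROBS.get(current, {"<end>": 1.0})
        # Greedy decoding: always pick the most probable next word.
        current = max(candidates, key=candidates.get)
        if current == "<end>":
            break
        words.append(current)
    return " ".join(words)

print(generate())  # "The cat is a small animal"
```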
Try It Out
Experiment with different prompts and observe how AI processes and generates responses based on context and structure.