Data Formats Used for Training AI

Before an AI model can be trained, its data must be converted into a format that the training software can read.

In this lesson, we'll explore the key data formats used to train AI, including CSV, JSON, and XML.


CSV

CSV, which stands for Comma-Separated Values, is used to store and transmit data in a table format.

Each row represents an individual data entry, while each column represents a specific attribute of that data. Within a row, the values are separated by commas.

For example, a CSV file that stores students' math and English grades by name could be represented as follows:

CSV Example
Name,Math,English
John Doe,85,90
Jane Smith,88,80

CSV files are saved with the .csv file extension and can be easily opened and edited with various data management programs like Microsoft Excel, Google Sheets, or database software.
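
As a minimal sketch of how a program might read the CSV example above, the following uses Python's standard csv module. Holding the data in a string rather than a .csv file and the variable names are assumptions made for illustration.

```python
import csv
import io

# The CSV example from this lesson, held in a string instead of a .csv file.
csv_text = """Name,Math,English
John Doe,85,90
Jane Smith,88,80
"""

# DictReader uses the first row as column names and yields one dict per row.
rows = list(csv.DictReader(io.StringIO(csv_text)))

# CSV values are always read as strings, so convert the grades to integers.
for row in rows:
    row["Math"] = int(row["Math"])
    row["English"] = int(row["English"])

print(rows[0])        # {'Name': 'John Doe', 'Math': 85, 'English': 90}

math_average = sum(row["Math"] for row in rows) / len(rows)
print(math_average)   # 86.5
```

Note that CSV itself carries no type information, which is why the grades have to be converted from text to numbers by hand.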


JSON

JSON (JavaScript Object Notation) is primarily used for storing and exchanging data in web and mobile applications.

JSON is composed of objects and arrays; objects are enclosed in curly braces { }, and arrays are enclosed in square brackets [ ].

For more details, see the next lesson.


JSON Example
// An array enclosed in square brackets
[
  // An object enclosed in curly braces
  {
    "Name": "John Doe",
    "Math": 85,
    "English": 90
  },
  {
    "Name": "Jane Smith",
    "Math": 88,
    "English": 80
  }
]
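
The JSON example above can be loaded with Python's standard json module, roughly as sketched below. Standard JSON parsers, including this one, do not accept // comments, so the explanatory comments are left out of the data; the variable names are assumptions made for illustration.

```python
import json

# The JSON example from this lesson, without the explanatory comments.
json_text = """
[
  {"Name": "John Doe", "Math": 85, "English": 90},
  {"Name": "Jane Smith", "Math": 88, "English": 80}
]
"""

# json.loads turns a JSON array into a list and each JSON object into a dict.
students = json.loads(json_text)

print(type(students).__name__)     # list
print(type(students[0]).__name__)  # dict
print(students[1]["English"])      # 80

# json.dumps converts the Python data back into JSON text.
round_trip = json.dumps(students, indent=2)
```

Unlike CSV, JSON distinguishes numbers from strings, so the grades arrive as integers without any manual conversion.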

XML

XML (eXtensible Markup Language) is mainly used to represent the hierarchical structure of data.

The key elements of XML are as follows:

  1. Tags: Names enclosed in angle brackets < > that wrap the data and express its hierarchical structure.

    • Tags are divided into start tags and end tags.
    • A start tag is denoted by <tagname>, and an end tag by </tagname>.
  2. Attributes: Used to provide additional information within a tag.

    • To add an attribute to a tag, use the format <tagname attributename="attributevalue">.
    • Example: <Student gender="male"> adds a gender attribute to a Student tag.
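
To illustrate tags and attributes, here is a small sketch that builds the <Student gender="male"> element with Python's standard xml.etree.ElementTree module. The Name child element and its value are assumptions added so the element has some content.

```python
import xml.etree.ElementTree as ET

# Create a Student element carrying a gender attribute.
student = ET.Element("Student", attrib={"gender": "male"})

# Add a child element; its start and end tags enclose the text "John Doe".
name = ET.SubElement(student, "Name")
name.text = "John Doe"

# Serialize the element to a string to see the start and end tags.
xml_text = ET.tostring(student, encoding="unicode")
print(xml_text)  # <Student gender="male"><Name>John Doe</Name></Student>

print(student.tag)            # Student
print(student.get("gender"))  # male
```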

Below is the JSON example represented in XML.

XML Example
<StudentList>
  <Student>
    <Name>John Doe</Name>
    <Math>85</Math>
    <English>90</English>
  </Student>
  <Student>
    <Name>Jane Smith</Name>
    <Math>88</Math>
    <English>80</English>
  </Student>
</StudentList>
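
One way to read the XML example above is with Python's standard xml.etree.ElementTree module, as in the sketch below. Converting each Student element into a dictionary is an illustrative choice, not something the lesson prescribes.

```python
import xml.etree.ElementTree as ET

# The XML example from this lesson.
xml_text = """
<StudentList>
  <Student>
    <Name>John Doe</Name>
    <Math>85</Math>
    <English>90</English>
  </Student>
  <Student>
    <Name>Jane Smith</Name>
    <Math>88</Math>
    <English>80</English>
  </Student>
</StudentList>
""".strip()

# Parse the text; root is the outermost element, StudentList.
root = ET.fromstring(xml_text)

# Walk the hierarchy: each Student element holds Name, Math, and English.
students = []
for student in root.findall("Student"):
    students.append({
        "Name": student.findtext("Name"),
        # Element text is always a string, so convert the grades to integers.
        "Math": int(student.findtext("Math")),
        "English": int(student.findtext("English")),
    })

print(root.tag)        # StudentList
print(len(students))   # 2
print(students[0])     # {'Name': 'John Doe', 'Math': 85, 'English': 90}
```

The resulting list of dictionaries matches what the JSON version of the same data produces when parsed.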

In addition, image files are used as training data for image-related AI models, and text files (.txt) are often used to train natural language processing models.

Mission

Which word best fills in the blank?

In ____, each row (horizontal line) represents one data entry, and the columns are separated by commas (,).
CSV
JSON
XML
HTML
