Lecture

Saving Extracted Article Data to a CSV File

In this lesson, we will learn how to save the extracted BBC News article titles into a CSV file.

CSV (Comma Separated Values) is a plain-text format in which each line is one row of data and the values within a row are separated by commas.

CSV files can be easily opened in spreadsheet programs like Excel and Google Sheets, and are commonly used to store and load data.
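As a small illustration of the format, the sketch below writes a few rows with Python's csv module and prints the resulting text. The sample titles are invented for this example and are not real BBC headlines. Note how a value that itself contains a comma is wrapped in quotes so it still counts as a single column.

```python
import csv
from io import StringIO

# Hypothetical sample rows (not real BBC headlines)
rows = [
    ["Number", "Article Title"],
    [1, "Markets rally"],
    [2, "Rain, wind and snow expected"],  # this value contains a comma
]

buffer = StringIO()
writer = csv.writer(buffer)
writer.writerows(rows)  # write all rows at once

csv_text = buffer.getvalue()
print(csv_text)
```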


Code to Extract Article Titles from the BBC Website

Python’s built-in csv module allows you to conveniently process and save data in CSV format.

Save Extracted Data to CSV
import csv
from io import StringIO

from bs4 import BeautifulSoup
import requests

# Send a request to the BBC News homepage
url = "https://www.bbc.com/news"
response = requests.get(url)

# Parse the HTML data
soup = BeautifulSoup(response.text, "html.parser")

# Extract 10 article titles from h2 HTML tags
titles = soup.find_all('h2', limit=10)

# Prepare data to save in CSV format
data = []
for title in titles:
    # Extract only the article title
    data.append([title.text])

The above code downloads the BBC News homepage, finds the article titles marked up with h2 HTML tags, and collects their text in a list so that it can be written out in CSV format.

The h2 (Heading 2) tag marks a second-level heading on a web page. News sites generally write headings with h1 (main title), h2 (subtitle), and h3 (sub-subtitle) tags, and here the article titles are in h2 tags.

The code soup.find_all('h2', limit=10) extracts up to 10 article titles written with h2 tags from the HTML data.
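To see how the limit argument behaves without making a network request, here is a minimal sketch that runs find_all on a short, made-up HTML string. The HTML and the variable names are assumptions for illustration only; the real BBC page is much larger and its structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded news page
html = """
<html><body>
  <h1>Site name</h1>
  <h2> First headline </h2>
  <h2>Second headline</h2>
  <h2>Third headline</h2>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# limit=2 stops the search after the first two matching h2 tags
tags = soup.find_all("h2", limit=2)

# .text returns the text inside the tag; strip() removes surrounding whitespace
headlines = [tag.text.strip() for tag in tags]
print(headlines)
```

If the page contains fewer matching tags than the limit, find_all simply returns the ones it found.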


Saving as a CSV File

Output Data as CSV Format
# Create a StringIO object (temporarily stores data in memory like a file)
output = StringIO()

# Create a CSV writer that writes into the StringIO object
csv_writer = csv.writer(output)

# Add headers to the CSV
csv_writer.writerow(['Number', 'Article Title'])

# Add the number and article title to the CSV
for idx, title in enumerate(titles, 1):
    csv_writer.writerow([idx, title.text.strip()])

# Output the result in CSV format
print(output.getvalue())

# Close the StringIO object
output.close()

The StringIO object can temporarily save data in memory like a file.

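csv.writer(output) creates a writer object that formats each row as CSV text and writes it into output. Each call to csv_writer.writerow() adds one row: first the header row, then one row per article.

Finally, output.getvalue() returns everything written so far as a single string, and print() displays it.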
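The sketch below repeats the same StringIO and csv.writer steps on two invented titles, then parses the text back with csv.reader to confirm that it really has two columns per row. The sample titles and the names sample_titles and parsed_rows are assumptions made for this example.

```python
import csv
from io import StringIO

# Hypothetical titles standing in for the scraped h2 text
sample_titles = ["Headline one", "Headline two"]

output = StringIO()
csv_writer = csv.writer(output)
csv_writer.writerow(["Number", "Article Title"])
for idx, title in enumerate(sample_titles, 1):
    csv_writer.writerow([idx, title])

csv_text = output.getvalue()  # read the text before closing the buffer
output.close()

# Parse the text back into rows to check the structure
parsed_rows = list(csv.reader(StringIO(csv_text)))
print(parsed_rows)
```

Note that csv.reader returns every value as a string, so the numbers come back as "1" and "2" rather than integers.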


Now, by executing the above code, the extracted BBC News article titles will be output in CSV format.

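To save the data as a CSV file on your computer instead of displaying it, add the following code before the output.close() line, because getvalue() raises a ValueError once the StringIO object has been closed:

Save as CSV File
# Write the CSV text held in the StringIO object to a file
with open('bbc_news.csv', 'w', newline='') as f:
    f.write(output.getvalue())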

The above code creates a bbc_news.csv file in the directory from which the Python script is run and saves the data in it.

For security reasons, file saving is restricted in the practice environment.
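An alternative, sketched here as an assumption rather than as part of the lesson's code, is to skip StringIO and pass the open file straight to csv.writer. The example uses invented titles, writes into a temporary folder so it does not clutter your working directory, and sets encoding="utf-8" explicitly, which is a choice made for this sketch.

```python
import csv
import os
import tempfile

# Hypothetical titles standing in for the scraped h2 text
sample_titles = ["Headline one", "Headline two"]

# A temporary folder keeps the example out of the working directory
folder = tempfile.mkdtemp()
path = os.path.join(folder, "bbc_news.csv")

# newline="" lets the csv module control the line endings itself
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Number", "Article Title"])
    for idx, title in enumerate(sample_titles, 1):
        writer.writerow([idx, title])

# Read the file back to check what was saved
with open(path, newline="", encoding="utf-8") as f:
    saved_rows = list(csv.reader(f))
print(saved_rows)
```

Writing directly to the file avoids holding the whole CSV text in memory, which matters little for ten titles but helps with larger data sets.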

To download the CSV file, use the Download button in the practice environment menu next to the Run button. :)
