# Importing requests and BeautifulSoup libraries
import requests
from bs4 import BeautifulSoup

# Wikipedia homepage URL
url = "https://www.wikipedia.org"

# Fetch HTML from the URL using the requests library
response = requests.get(url)

# Set the encoding of the fetched HTML to UTF-8
response.encoding = 'utf-8'

# Process the fetched HTML using BeautifulSoup and store it in the soup variable
soup = BeautifulSoup(response.text, 'html.parser')

# Extract the text of the first h1 (heading 1, title) tag on the webpage
h1_title = soup.find('h1').text

# Extract the text of the first p (paragraph) tag on the webpage
p_description = soup.find('p').text

print('Title:')
print(h1_title) # Print the title

print('-' * 10) # Separator line

print('Description:')
print(p_description) # Print the description

# Fetching Wikipedia Homepage Information with Python

Wikipedia is an online encyclopedia collaboratively built by people around the world. 📘

In this lesson, we'll use Python code to collect specific information from a Wikipedia page.

Using the `BeautifulSoup` and `requests` libraries, we can fetch the title and description of the Wikipedia homepage, as shown below.

<br />

## Step 1: Import Required Libraries

```python title="Importing requests and BeautifulSoup Libraries"
import requests
from bs4 import BeautifulSoup
```

The above code performs the following tasks:

- Uses the `import` keyword to load the **requests** library for HTTP communication

- Uses `from ... import` to import the **BeautifulSoup** class from the **bs4** package, which parses HTML so that data can be extracted from it

<br />

## Step 2: Fetch HTML from URL and Store It in a Variable

Use **requests** to fetch the HTML of a webpage, then parse it with **BeautifulSoup** and store the result in a variable, as shown below:

```python title="Fetching HTML from Wikipedia Homepage"
# Wikipedia homepage URL
url = "https://www.wikipedia.org"

# Fetch HTML from the URL using the requests library
response = requests.get(url)

# Set the encoding of the fetched HTML to UTF-8
response.encoding = 'utf-8'

# Parse the fetched HTML with BeautifulSoup and store the result in the soup variable
soup = BeautifulSoup(response.text, 'html.parser')
```

The above code performs the following tasks:

- Stores the Wikipedia homepage URL in the `url` variable

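- Fetches HTML from the URL using `requests.get(url)` and stores the response in the `response` variable

- Sets `response.encoding` to `'utf-8'` so that `response.text` is decoded as UTF-8

- Parses the fetched HTML using `BeautifulSoup(response.text, 'html.parser')` and stores the parsed result in the **soup** variable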
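
The fetch step above assumes the request succeeds. As a minimal sketch of a more defensive version, the steps can be wrapped in a function that sets a timeout and checks the HTTP status before parsing. The name `fetch_soup` and the 10-second default are assumptions made for illustration and are not part of the lesson's code.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, timeout=10):
    """Fetch a page and return its parsed HTML (hypothetical helper)."""
    # timeout stops the request from waiting indefinitely on a slow server
    response = requests.get(url, timeout=timeout)

    # Raise requests.HTTPError for 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()

    # Same as in the lesson: decode the response body as UTF-8
    response.encoding = 'utf-8'

    return BeautifulSoup(response.text, 'html.parser')


# Example call (requires network access):
# soup = fetch_soup("https://www.wikipedia.org")
```

Network problems and error status codes raised here are all subclasses of `requests.RequestException`, so a caller can catch that single exception type.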

<br />

## Step 3: Extract Title and Description Information

Extract the desired information from the **soup** variable as shown below:

```python title="Extracting Title and Description from Wikipedia Homepage"
# Extract the text of the first h1 (heading 1, title) tag on the webpage
h1_title = soup.find('h1').text

# Extract the text of the first p (paragraph) tag on the webpage
p_description = soup.find('p').text
```

The above code performs the following tasks:

- Uses `soup.find('h1')` to find the first **h1** tag in the parsed HTML, reads its text with `.text`, and stores the title in the **h1_title** variable

- Uses `soup.find('p')` to find the first **p** tag in the parsed HTML, reads its text with `.text`, and stores the description in the **p_description** variable
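
Note that `find` returns `None` when no matching tag exists, in which case `.text` raises an `AttributeError`. The sketch below shows one way to guard against that. The helper name `first_text` and the sample HTML are made up for illustration, and `get_text(strip=True)` is used in place of `.text` to trim surrounding whitespace.

```python
from bs4 import BeautifulSoup


def first_text(soup, tag_name, default=""):
    """Return the stripped text of the first matching tag, or a default (hypothetical helper)."""
    tag = soup.find(tag_name)  # find() returns None when no tag matches
    if tag is None:
        return default
    return tag.get_text(strip=True)


# Sample HTML with an h1 tag but no p tag (illustrative only)
sample_html = "<html><body><h1> Sample Title </h1></body></html>"
sample_soup = BeautifulSoup(sample_html, 'html.parser')

h1_title = first_text(sample_soup, 'h1')
p_description = first_text(sample_soup, 'p', default="(no paragraph found)")

print(h1_title)
print(p_description)
```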

Finally, use the `print` function to display the title and description extracted from the page.

<br />

## Practice

Click the _`Run Code`_ button on the right-hand side to see the scraping results.
The first execution of the code may take some time.

You can also change the value of `url` in the code (e.g., to `https://www.codefriends.net`) to fetch information from other webpages.
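
Other pages may be structured differently from the Wikipedia homepage, for example by having no **h1** tag. As a rough sketch under that assumption, the lesson's steps can be collected into reusable functions that fall back to the page's `<title>` tag. The names `summarize_html` and `summarize_url`, the placeholder strings, and the timeout value are all hypothetical.

```python
import requests
from bs4 import BeautifulSoup


def summarize_html(html):
    """Return (title, description) from an HTML string (hypothetical helper)."""
    soup = BeautifulSoup(html, 'html.parser')

    # Prefer the first h1; fall back to the <title> tag if the page has no h1
    heading = soup.find('h1')
    if heading is None:
        heading = soup.find('title')
    paragraph = soup.find('p')

    title = heading.get_text(strip=True) if heading is not None else "(no title found)"
    description = (
        paragraph.get_text(strip=True) if paragraph is not None else "(no paragraph found)"
    )
    return title, description


def summarize_url(url, timeout=10):
    """Fetch a page and summarize it; the timeout value is an assumption."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    # As in the lesson, this assumes the page is encoded as UTF-8
    response.encoding = 'utf-8'
    return summarize_html(response.text)


# Offline check with a page that has no h1 tag
demo_html = "<html><head><title>Demo</title></head><body><p>Hello.</p></body></html>"
demo_result = summarize_html(demo_html)
print(demo_result)

# To try a live page (requires network access):
# print(summarize_url("https://www.codefriends.net"))
```

Keeping the parsing in `summarize_html` separate from the network call makes it possible to test the extraction logic on a plain HTML string.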

BeautifulSoup is a Python library for extracting data from HTML and XML files. It is useful for parsing HTML and pulling out the information you need when web scraping.