Key Methods and Usage of BeautifulSoup
In this lesson, we will look at the key methods of BeautifulSoup
and how to use them with some simple examples.
Finding a Specific Element with find
To find a specific element on a web page, use the find() method. It returns the first element that meets the criteria.
from bs4 import BeautifulSoup

html_doc = """
<html><body>
<h1>Hello</h1>
<p>Paragraph 1</p>
<p>Paragraph 2</p>
</body></html>
"""

# Parse the HTML
soup = BeautifulSoup(html_doc, 'html.parser')

# Find the h1 tag
h1_tag = soup.find('h1')

# Output: Hello
print(h1_tag.text)
In the example above, find() locates the h1 tag and the code prints its text content. find() returns only the first matching element, even when several elements match. If nothing matches, it returns None.
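The behavior described above can be sketched as follows. This is a minimal illustration, not the lesson's own code: the class name intro and the missing h2 tag are assumptions added to the sample markup to show attribute filtering and the None result.

```python
from bs4 import BeautifulSoup

# Sample markup; the "intro" class is a hypothetical addition for illustration
html_doc = """
<html><body>
<h1>Hello</h1>
<p class="intro">Paragraph 1</p>
<p>Paragraph 2</p>
</body></html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# Two <p> tags exist, but find() returns only the first one
first_p = soup.find('p')
print(first_p.text)  # Paragraph 1

# Filter by attribute; class_ is used because "class" is a Python keyword
intro = soup.find('p', class_='intro')
print(intro['class'])  # ['intro']

# find() returns None when nothing matches, so check before using the result
missing = soup.find('h2')
if missing is None:
    print('No h2 tag found')
```

Checking for None before reading .text avoids an AttributeError when a page does not contain the expected tag.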
Finding Multiple Elements with find_all
If you want to find all elements that meet the criteria, use the find_all() method. This method returns the results as a list, allowing you to handle multiple elements at once.
from bs4 import BeautifulSoup

html_doc = """
<html><body>
<p>Paragraph 1</p>
<p>Paragraph 2</p>
<p>Paragraph 3</p>
</body></html>
"""

# Parse the HTML
soup = BeautifulSoup(html_doc, 'html.parser')

# Find all p tags
p_tags = soup.find_all('p')

# Print all p tags
for p in p_tags:
    # Output: Paragraph 1, Paragraph 2, Paragraph 3
    print(p.text)
This code finds and prints all p tags in the string held by the html_doc variable. The p_tags variable holds the matching tag objects in a list, like [<p>Paragraph 1</p>, <p>Paragraph 2</p>, <p>Paragraph 3</p>]. Each item is a tag, not a plain string, so the loop reads p.text to get the text inside it. Thus, find_all() is useful when you want to find multiple elements at once.
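To make the difference between tag objects and their text concrete, here is a small sketch that reuses the same sample markup. The variable names texts and first_two are illustrative, and limit is an optional find_all() argument not covered in the lesson itself.

```python
from bs4 import BeautifulSoup

html_doc = """
<html><body>
<p>Paragraph 1</p>
<p>Paragraph 2</p>
<p>Paragraph 3</p>
</body></html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')
p_tags = soup.find_all('p')

# Each item in the result is a Tag object, not a string
print(type(p_tags[0]).__name__)  # Tag

# Build a list of plain strings from the tags
texts = [p.get_text() for p in p_tags]
print(texts)  # ['Paragraph 1', 'Paragraph 2', 'Paragraph 3']

# limit caps the number of results returned
first_two = soup.find_all('p', limit=2)
print(len(first_two))  # 2

# When nothing matches, the result is empty rather than None
no_match = soup.find_all('h2')
print(len(no_match))  # 0
```

Because an unmatched find_all() gives an empty result, a for loop over it simply does nothing, so no None check is needed.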
Finding Elements Using CSS Selectors with select
To select elements using CSS selectors, use select(). It returns a list of every element that matches the selector.
from bs4 import BeautifulSoup

html_doc = """
<html><body>
<p>Paragraph 1</p>
<div class="content">
<p>Paragraph 2</p>
<p>Paragraph 3</p>
</div>
</body></html>
"""

# Parse the HTML
soup = BeautifulSoup(html_doc, 'html.parser')

# Find all p tags within .content class
content_p_tags = soup.select('.content p')

for p in content_p_tags:
    # Output: Paragraph 2, Paragraph 3
    print(p.text)
This code selects and prints all p tags inside the element that has the class content. The first paragraph is outside that element, so it is not selected.
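Beyond the descendant selector used above, select() accepts other common CSS selectors. The sketch below shows a few of them. The id lead and the class note are hypothetical additions to the sample markup, made only so that each selector has something to match.

```python
from bs4 import BeautifulSoup

# Sample markup; the "lead" id and "note" class are assumptions for illustration
html_doc = """
<html><body>
<p id="lead">Paragraph 1</p>
<div class="content">
<p>Paragraph 2</p>
<p class="note">Paragraph 3</p>
</div>
</body></html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# ID selector: the element whose id is "lead"
by_id = [p.text for p in soup.select('#lead')]
print(by_id)  # ['Paragraph 1']

# Child combinator: p tags that are direct children of div.content
children = [p.text for p in soup.select('div.content > p')]
print(children)  # ['Paragraph 2', 'Paragraph 3']

# Tag name combined with a class
notes = [p.text for p in soup.select('p.note')]
print(notes)  # ['Paragraph 3']

# A selector that matches nothing gives an empty result
nothing = soup.select('.missing')
print(len(nothing))  # 0
```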
Selecting the First Element with select_one
The select_one() method is similar to select(), but it returns only the first element that meets the criteria.
from bs4 import BeautifulSoup

html_doc = """
<html><body>
<div class="content">
<p>Paragraph 1</p>
<p>Paragraph 2</p>
</div>
</body></html>
"""

# Parse the HTML
soup = BeautifulSoup(html_doc, 'html.parser')

# Find the first p tag within .content class
first_p_tag = soup.select_one('.content p')

# Output: Paragraph 1
print(first_p_tag.text)
In the above example, select_one() finds the first p tag inside the element that has the class content, and the code prints its text.
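As a rough sketch of how select_one() relates to select(), the example below compares the two on the same markup and shows what happens when the selector matches nothing. The .sidebar selector is a hypothetical one chosen because it does not appear in the sample HTML.

```python
from bs4 import BeautifulSoup

html_doc = """
<html><body>
<div class="content">
<p>Paragraph 1</p>
<p>Paragraph 2</p>
</div>
</body></html>
"""

soup = BeautifulSoup(html_doc, 'html.parser')

# select_one() gives a single tag; select() gives every match
first = soup.select_one('.content p')
all_matches = soup.select('.content p')
print(first.text)               # Paragraph 1
print(len(all_matches))         # 2
print(first.text == all_matches[0].text)  # True

# When nothing matches, select_one() returns None instead of a tag
missing = soup.select_one('.sidebar p')
fallback_text = missing.text if missing is not None else ''
print(repr(fallback_text))      # ''
```

Like find(), select_one() is convenient when only one element is expected, as long as the None case is handled.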
What is the method to find the first matching element in BeautifulSoup?
find_all()
select()
find()
select_one()