Python Web Scraping Notes: HTML Content Search Methods with the bs4 Library (3)

find_all()

find_all( name , attrs , recursive , string , **kwargs )

Returns a list that holds the search results.
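To make the return value concrete, here is a minimal sketch that runs without network access. The inline HTML string is a made-up stand-in for the demo page, not its real content. It shows that the result can be measured, indexed, and iterated like a list, and that a search with no matches gives back an empty result rather than None.

```python
from bs4 import BeautifulSoup

# Hypothetical inline HTML, used here instead of downloading the demo page.
html = '<p><a id="link1">Basic Python</a> and <a id="link2">Advanced Python</a></p>'
soup = BeautifulSoup(html, "html.parser")

links = soup.find_all("a")            # all <a> tags, in document order
count = len(links)                    # the result supports len()
first_text = links[0].get_text()      # ... and indexing
ids = [a["id"] for a in links]        # ... and iteration

missing = soup.find_all("table")      # no <table> in the document: empty result

print(count, first_text, ids, len(missing))
```

Because an empty result is falsy, a plain `if links:` check is enough before indexing into it.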

name: the search string matched against tag names:

>>> import requests
>>> from bs4 import BeautifulSoup
>>> r = requests.get("http://python123.io/ws/demo.html")  # download the demo page
>>> demo = r.text
>>> soup = BeautifulSoup(demo, 'html.parser')

>>> soup.find_all('a')
[<a class="py1" href="http://www.icourse163.org/course/BIT-268001" id="link1">Basic Python</a>, <a class="py2" href="http://www.icourse163.org/course/BIT-1001870001" id="link2">Advanced Python</a>]
>>> soup.find_all(['a','b'])
[<b>The demo python introduces several python courses.</b>, <a class="py1" href="http://www.icourse163.org/course/BIT-268001" id="link1">Basic Python</a>, <a class="py2" href="http://www.icourse163.org/course/BIT-1001870001" id="link2">Advanced Python</a>]
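The two calls above pass a string and a list as name. The sketch below, which uses a small made-up HTML snippet rather than the demo page, illustrates two details: a string matches the whole tag name exactly (so 'a' does not pick up abbr), and a list returns tags matching any listed name, in document order rather than list order.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet chosen so that one tag name ("abbr") starts with "a".
html = '<body><p><b>bold</b> <a href="#x">link</a> <abbr>abbr</abbr></p></body>'
soup = BeautifulSoup(html, "html.parser")

# A string filter matches the tag name exactly: only <a>, not <abbr>.
only_a = [tag.name for tag in soup.find_all("a")]

# A list filter matches any of the names; results follow document order,
# so <b> comes first even though "a" is listed first.
a_or_b = [tag.name for tag in soup.find_all(["a", "b"])]

print(only_a, a_or_b)
```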

>>> for tag in soup.find_all(True):
...     print(tag.name)
...
html
head
title
body
p
b
p
a
a
>>> import re
>>> for tag in soup.find_all(re.compile('b')):
...     print(tag.name)
...
body
b
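The last two loops pass True and a compiled regular expression as name. True matches every tag, and the regular expression is applied with search semantics, so re.compile('b') matches any tag name that contains a "b", which is why body shows up alongside b. The following sketch, using a made-up HTML string in place of the demo page, shows how anchoring the pattern narrows the match.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML with the same kinds of tags as the demo page.
html = "<html><head><title>t</title></head><body><p><b>x</b></p></body></html>"
soup = BeautifulSoup(html, "html.parser")

# True matches every tag in the document.
all_names = [tag.name for tag in soup.find_all(True)]

# Unanchored pattern: any tag name containing "b".
contains_b = [tag.name for tag in soup.find_all(re.compile("b"))]

# Anchored at both ends: only the tag named exactly "b".
exactly_b = [tag.name for tag in soup.find_all(re.compile("^b$"))]

print(all_names)
print(contains_b)
print(exactly_b)
```

When only one exact tag name is wanted, the plain string form find_all('b') is simpler than an anchored pattern.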