Beautiful Soup

Beautiful Soup
Original author	Leonard Richardson
Written in	Python
Type	HTML parsing library, web scraping
License	Python Software Foundation License (Beautiful Soup 3 and earlier); MIT License (Beautiful Soup 4 and later)
Website	www.crummy.com/software/BeautifulSoup/

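Beautiful Soup is a Python package for parsing HTML and XML documents, including documents with malformed markup such as unclosed tags (such documents are often called tag soup). It builds a parse tree for the page from which data can be extracted, which is useful for web scraping.^[1]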
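
As a minimal sketch of the parse-tree idea described above, the snippet below feeds a deliberately malformed fragment to Beautiful Soup and reads data back out of the resulting tree. The markup in `broken_html` is a made-up input for illustration, and the choice of Python's built-in `html.parser` is an assumption; other parsers may repair the same markup differently.

```python
from bs4 import BeautifulSoup

# Hypothetical malformed input: the last <a>, the <p>, and the <div>
# are never closed.
broken_html = '<div><p>Links: <a href="/one">One</a> <a href="/two">Two'

# Build the parse tree using Python's built-in HTML parser.
soup = BeautifulSoup(broken_html, "html.parser")

# Walk the tree: collect the target and the text of every anchor.
anchors = soup.find_all("a")
hrefs = [a.get("href") for a in anchors]
texts = [a.get_text() for a in anchors]

print(hrefs)
print(texts)

# Unclosed elements are closed when the input ends, so the serialized
# tree is well-formed even though the input was not.
print(soup.prettify())
```

Both anchors are reachable through `find_all` even though the second one was never closed in the input.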

In 2021, Beautiful Soup dropped support for Python 2.7; release 4.9.3 was the last version to support it.^[2]

Example code

#!/usr/bin/env python3
# Anchor extraction from an HTML document
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Fetch the page and parse the response with Python's built-in HTML parser
with urlopen('https://en.wikipedia.org/wiki/Main_Page') as response:
    soup = BeautifulSoup(response, 'html.parser')
    # Print the href of every <a> element; anchors without an href print '/'
    for anchor in soup.find_all('a'):
        print(anchor.get('href', '/'))

See also

Comparison of HTML parsers

References

[1] Beautiful Soup website. Retrieved 18 April 2012. Archived from the original on 2017-02-03. "Beautiful Soup is licensed under the same terms as Python itself"
[2] Richardson, Leonard. Beautiful Soup 4.10.0. beautifulsoup. Google Groups. 7 September 2021. Retrieved 27 September 2022. Archived from the original on 2022-09-29.
