Scraping a Web Page with Python

Author: 死亡之神 | Published: 2025-11-25

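import requests
from bs4 import BeautifulSoup

# Fetch a web page and return its HTML, pretty-printed
def fetch_webpage(url):
    try:
        # Send an HTTP GET request; the timeout stops a stalled server from hanging the call
        response = requests.get(url, timeout=10)
        # Check whether the request succeeded
        if response.status_code == 200:
            # Parse the page content with BeautifulSoup
            soup = BeautifulSoup(response.text, 'html.parser')
            return soup.prettify()  # Return the formatted HTML
        else:
            return f"Error: Unable to fetch webpage. Status code: {response.status_code}"
    except requests.RequestException as e:
        # Covers connection errors, timeouts, and invalid URLs
        return f"Error: {e}"

# Example URL
url = "https://example.com"
# Call the function and print the result
print(fetch_webpage(url))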

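Explanation:

Imports:
- requests: sends HTTP requests.
- BeautifulSoup: parses HTML content.
The fetch_webpage function:
- Takes a URL as its only parameter.
- Sends an HTTP GET request with requests.get().
- Checks whether the response status code is 200 (the request succeeded).
- Parses the HTML with BeautifulSoup and returns it as a formatted HTML string.
- If the request fails or an exception is raised, returns an error message instead.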
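
Because fetch_webpage returns its error message as a plain string, the same type as a successful result, a caller cannot easily tell the two apart. One possible alternative is to let the failure surface as an exception. The sketch below is only an illustration: the name fetch_html and the 10-second default timeout are assumptions, not part of the original code.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, or raise requests.RequestException.

    fetch_html and the default timeout are illustrative assumptions.
    """
    # The timeout keeps a stalled server from blocking the call indefinitely
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx responses
    response.raise_for_status()
    return response.text


try:
    # This URL has no scheme, so requests rejects it before connecting
    fetch_html("not-a-valid-url")
except requests.RequestException as exc:
    print(f"Request failed: {exc}")
```

With this shape, the caller decides how to react to a failure (retry, log, skip) instead of having to inspect the returned text for an "Error:" prefix.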
Example call:
- Defines an example URL (here, https://example.com).
- Calls fetch_webpage and prints the result.
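
Printing the formatted HTML is a starting point; in practice a scraper usually pulls specific elements out of the parsed tree. The following is a minimal sketch of that step, assuming the goal is the page title and its links. The helper name extract_title_and_links and the sample HTML are hypothetical and chosen only for illustration; the HTML is inlined so the example runs without a network connection.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def extract_title_and_links(html, base_url):
    """Return the page title and a list of absolute link URLs found in html.

    Hypothetical helper for illustration; not part of the original code.
    """
    soup = BeautifulSoup(html, "html.parser")
    # soup.title is None when the document has no <title> element
    title = soup.title.get_text(strip=True) if soup.title else None
    # href=True keeps only <a> tags that actually carry an href attribute;
    # urljoin resolves relative paths against the page URL
    links = [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]
    return title, links


# Sample HTML standing in for a fetched page (an assumption for this sketch)
sample_html = """
<html><head><title>Example Domain</title></head>
<body>
<a href="/about">About</a>
<a href="https://www.iana.org/domains/example">More information</a>
<a>no href here</a>
</body></html>
"""

title, links = extract_title_and_links(sample_html, "https://example.com")
print(title)
print(links)
```

In a real run, the html argument would be the text of a fetched response rather than an inline string.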
