Requests

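1. Introduction

Requests is one of the most popular HTTP libraries for Python, known for its clean and elegant API. Its slogan is:

"HTTP for Humans": making HTTP requests human-friendly.

Compared with Python's built-in urllib, Requests offers a more intuitive and easier-to-use interface, so you can do more with less code.

Key advantages:

  • 🚀 A simple, intuitive API with a gentle learning curve
  • 🔧 Automatic handling of common chores such as encoding and JSON parsing
  • 🛡️ Built-in session management and cookie persistence
  • ⚡ Connection pooling, international domain names, and URL encoding
  • 📦 Asynchronous use through extensions such as requests-futures (Requests itself is synchronous)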

2. Installation

Install it with pip:

pip install requests

After installation, import it in Python:

import requests

Verify the installation:

print(requests.__version__)

3. Quick Start

A minimal HTTP request takes just two lines of code:

import requests

response = requests.get('https://api.github.com')
print(response.status_code)  # 200
print(response.json())       # the parsed JSON data

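4. Sending GET Requests

GET requests retrieve data from a server and are the most commonly used HTTP method.

4.1 Basic GET Request

response = requests.get('https://api.github.com')

4.2 GET Request with Parameters

Pass the query string through the params argument: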

payload = {
    'q': 'python requests',
    'page': 1,
    'per_page': 10
}

response = requests.get('https://api.github.com/search/repositories', params=payload)

# Equivalent to requesting:
# https://api.github.com/search/repositories?q=python+requests&page=1&per_page=10

print(response.url)  # inspect the URL that was actually requested
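
To check how params ends up in the URL without sending anything over the network, the request can be prepared locally. The following is a minimal sketch that reuses the search parameters from above; requests.Request(...).prepare() only builds the request and does not contact the server.

```python
import requests

payload = {'q': 'python requests', 'page': 1, 'per_page': 10}

# Build the request locally; nothing is sent over the network.
prepared = requests.Request(
    'GET',
    'https://api.github.com/search/repositories',
    params=payload,
).prepare()

# The space in 'python requests' is encoded as '+', and the integers
# are converted to strings.
print(prepared.url)
```

This is handy in unit tests, where the encoded URL can be asserted without a live API.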

4.3 GET Request with Headers

headers = {
    'User-Agent': 'MyApp/1.0',
    'Accept': 'application/json'
}

response = requests.get('https://httpbin.org/get', headers=headers)

5. Sending POST Requests

POST requests submit data to a server.

5.1 Sending Form Data

data = {
    'username': 'admin',
    'password': '123456'
}

response = requests.post('https://httpbin.org/post', data=data)
print(response.json())

5.2 Sending JSON Data

With the json argument, Requests serializes the data and sets Content-Type: application/json automatically:

import json

payload = {
    'title': 'Hello World',
    'content': 'This is a test post.',
    'tags': ['python', 'requests']
}

response = requests.post('https://httpbin.org/post', json=payload)
print(response.json())

Tip: json=payload is roughly equivalent to data=json.dumps(payload) plus setting the Content-Type: application/json request header.
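
The equivalence can be checked locally by preparing both variants and comparing them. This is a small sketch, not an exhaustive test: the payload is a shortened version of the one above, and nothing is sent to httpbin.org.

```python
import json

import requests

payload = {'title': 'Hello World', 'tags': ['python', 'requests']}
url = 'https://httpbin.org/post'

# Variant 1: let Requests serialize the payload and set the header.
via_json = requests.Request('POST', url, json=payload).prepare()

# Variant 2: serialize by hand and set the header explicitly.
via_data = requests.Request(
    'POST',
    url,
    data=json.dumps(payload),
    headers={'Content-Type': 'application/json'},
).prepare()

print(via_json.headers['Content-Type'])
# The body may be bytes in one case and str in the other, so compare
# the decoded JSON rather than the raw bodies.
print(json.loads(via_json.body) == json.loads(via_data.body))
```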

5.3 Uploading Files

files = {
    'file': open('report.csv', 'rb')
}

response = requests.post('https://httpbin.org/post', files=files)
print(response.json())

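# Remember to close the file
files['file'].close()

A with statement is the better choice because it closes the file automatically:

with open('report.csv', 'rb') as f:
    response = requests.post('https://httpbin.org/post', files={'file': f})

To set a custom filename and MIME type:

with open('report.csv', 'rb') as f:
    files = {'file': ('custom_name.csv', f, 'text/csv')}
    response = requests.post('https://httpbin.org/post', files=files)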

6. Other HTTP Methods

Requests supports all standard HTTP methods with the same API style as GET and POST:

# PUT request - update a resource
response = requests.put('https://httpbin.org/put', data={'key': 'value'})

# DELETE request - delete a resource
response = requests.delete('https://httpbin.org/delete')

# PATCH request - partial update
response = requests.patch('https://httpbin.org/patch', data={'key': 'value'})

# HEAD request - fetch only the response headers
response = requests.head('https://httpbin.org/get')

# OPTIONS request - ask which HTTP methods are supported
response = requests.options('https://httpbin.org/get')

7. Passing Parameters

7.1 URL Query Parameters (params)

params = {'key1': 'value1', 'key2': 'value2'}
response = requests.get('https://httpbin.org/get', params=params)
# Actual URL: https://httpbin.org/get?key1=value1&key2=value2

If a value is a list, Requests expands it into repeated keys:

params = {'key1': 'value1', 'key2': ['value2a', 'value2b']}
response = requests.get('https://httpbin.org/get', params=params)
# Actual URL: https://httpbin.org/get?key1=value1&key2=value2a&key2=value2b

7.2 Request Body Data (data)

# Form-encoded
response = requests.post('https://httpbin.org/post', data={'key': 'value'})

# Raw string
response = requests.post('https://httpbin.org/post', data='raw string data')

7.3 JSON Data (json)

response = requests.post('https://httpbin.org/post', json={'key': 'value'})

7.4 Multipart-Encoded Files (files)

with open('test.txt', 'rb') as f:
    response = requests.post('https://httpbin.org/post', files={'file': f})

8. Custom Headers

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'Authorization': 'Bearer your_token_here',
    'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8'
}

response = requests.get('https://httpbin.org/get', headers=headers)

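Note: custom headers have lower precedence than more specific sources of information. For example, Requests overrides a custom Content-Length header when it can determine the length of the body, and an Authorization header set through headers is overridden by credentials found in .netrc or passed through the auth parameter.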
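
The Content-Length case can be observed without any network traffic. In the sketch below, the header values '999' and 'demo' and the X-Trace header name are made up for illustration; the request is only prepared, never sent.

```python
import requests

body = 'hello'

prepared = requests.Request(
    'POST',
    'https://httpbin.org/post',
    data=body,
    # Hypothetical values: a deliberately wrong length and a custom header.
    headers={'Content-Length': '999', 'X-Trace': 'demo'},
).prepare()

print(prepared.headers['Content-Length'])  # recomputed from the body
print(prepared.headers['X-Trace'])         # passed through unchanged
```

Headers that Requests has no reason to manage, such as the custom one here, are sent as given.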


9. Response Content

Requests offers several ways to read the response body:

9.1 Text Content

response = requests.get('https://httpbin.org/html')
print(response.text)  # the decoded text

Requests guesses the encoding from the response headers. You can also set it explicitly before reading response.text:

response.encoding = 'utf-8'
print(response.text)

9.2 Binary Content

response = requests.get('https://httpbin.org/image/png')
print(response.content)  # a bytes object

# Save the image
with open('image.png', 'wb') as f:
    f.write(response.content)

9.3 JSON Content

response = requests.get('https://api.github.com')
data = response.json()  # parse the JSON body
print(data['current_user_url'])

If the body is not valid JSON, response.json() raises requests.exceptions.JSONDecodeError.
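
One way to guard against a non-JSON body is a small helper that returns None instead of raising. The sketch below is self-contained: it starts a throwaway local server (the HtmlHandler class and the json_or_none helper are hypothetical names, not part of Requests) that answers with HTML, then shows the helper in action. It catches ValueError, which the JSON decode errors raised by response.json() derive from.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class HtmlHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint that answers with HTML instead of JSON."""

    def do_GET(self):
        body = b'<h1>not json</h1>'
        self.send_response(200)
        self.send_header('Content-Type', 'text/html')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def json_or_none(response):
    """Return the parsed JSON body, or None if the body is not valid JSON."""
    try:
        return response.json()
    except ValueError:  # the JSON decode error is a ValueError subclass
        return None


server = HTTPServer(('127.0.0.1', 0), HtmlHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    response = session.get(f'http://127.0.0.1:{server.server_port}/', timeout=5)

result = json_or_none(response)
print(response.status_code, result)

server.shutdown()
server.server_close()
```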

9.4 Raw Response Content

response = requests.get('https://httpbin.org/get', stream=True)
print(response.raw.read(10))  # read the first 10 bytes

10. Response Status Codes

response = requests.get('https://httpbin.org/get')

print(response.status_code)  # 200

# Built-in status code lookup
print(response.status_code == requests.codes.ok)           # True
print(response.status_code == requests.codes.not_found)    # False

Common status code constants

Constant                      Code  Meaning
requests.codes.ok             200   Request succeeded
requests.codes.created        201   Resource created
requests.codes.bad_request    400   Bad request
requests.codes.unauthorized   401   Unauthorized
requests.codes.forbidden      403   Forbidden
requests.codes.not_found      404   Not found
requests.codes.server_error   500   Server error

Raising exceptions automatically

Call raise_for_status() to raise an HTTPError when the status code is a 4xx or 5xx error:

response = requests.get('https://httpbin.org/status/404')

try:
    response.raise_for_status()
except requests.exceptions.HTTPError as e:
    print(f'Request failed: {e}')
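
To see raise_for_status() against several status codes at once, the sketch below runs a throwaway local server whose path selects the status code. StatusHandler is a hypothetical helper written for this demo, not part of Requests.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class StatusHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: GET /<code> answers with that status code."""

    def do_GET(self):
        self.send_response(int(self.path.strip('/')))
        self.send_header('Content-Length', '0')
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(('127.0.0.1', 0), StatusHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f'http://127.0.0.1:{server.server_port}'

outcomes = {}
with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    for code in (200, 404, 500):
        response = session.get(f'{base}/{code}', timeout=5)
        try:
            response.raise_for_status()
            outcomes[code] = 'ok'
        except requests.exceptions.HTTPError as e:
            # The failed response stays available on the exception.
            outcomes[code] = f'HTTPError ({e.response.status_code})'

print(outcomes)

server.shutdown()
server.server_close()
```

The exception carries the original response as e.response, which is useful for logging the status code or the error body.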

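11. Response Headers

response = requests.get('https://httpbin.org/get')

# All response headers (a dict-like object with case-insensitive keys)
print(response.headers)

# A specific response header
print(response.headers['Content-Type'])
print(response.headers.get('X-Custom-Header', 'default_value'))

Note: response.headers is a case-insensitive dictionary, so response.headers['content-type'] returns the same value as response.headers['Content-Type'].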


12. Cookie Handling

12.1 Sending Cookies

cookies = {'session_id': 'abc123'}
response = requests.get('https://httpbin.org/cookies', cookies=cookies)

Using a RequestsCookieJar object:

jar = requests.cookies.RequestsCookieJar()
jar.set('cookie1', 'value1', domain='httpbin.org')
jar.set('cookie2', 'value2', domain='httpbin.org')

response = requests.get('https://httpbin.org/cookies', cookies=jar)

12.2 Reading Cookies from a Response

# This endpoint sets the cookie on a redirect response, so stop at that
# response instead of following the redirect
response = requests.get('https://httpbin.org/cookies/set/session/abc123',
                        allow_redirects=False)
print(response.cookies['session'])  # 'abc123'

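13. Sessions

A Session object keeps certain settings (cookies, authentication, and so on) across requests, so you do not have to set them each time.

13.1 Basic Usage

# Create a session
session = requests.Session()

# Set session-level defaults
session.headers.update({
    'User-Agent': 'MyApp/1.0',
    'Authorization': 'Bearer token123'
})

# Every request made through the session carries the same cookies and headers
session.get('https://httpbin.org/cookies/set/session/abc123')
response = session.get('https://httpbin.org/cookies')
print(response.json())
# {'cookies': {'session': 'abc123'}}

# Close the session when you are done
session.close()

13.2 Using a Context Manager

with requests.Session() as session:
    session.get('https://httpbin.org/cookies/set/session/abc123')
    response = session.get('https://httpbin.org/cookies')
    print(response.json())

13.3 Advantages of Sessions

  • Cookie persistence: cookies are kept automatically across requests
  • Connection pooling: the underlying TCP connections are reused, which improves performance
  • Default configuration: headers, authentication, and other settings are defined in one place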
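
The default-configuration point can be illustrated offline: session.prepare_request() merges the session defaults into a request without sending it. In this sketch the header values and the X-Request-Id header name are made up for illustration.

```python
import requests

session = requests.Session()
session.headers.update({'User-Agent': 'MyApp/1.0', 'Accept': 'application/json'})

# Per-request headers are merged with the session defaults;
# on a conflict the per-request value wins.
request = requests.Request(
    'GET',
    'https://httpbin.org/get',
    headers={'Accept': 'text/html', 'X-Request-Id': 'demo-1'},  # hypothetical values
)
prepared = session.prepare_request(request)  # built locally, nothing is sent

print(prepared.headers['User-Agent'])    # from the session
print(prepared.headers['Accept'])        # overridden by the request
print(prepared.headers['X-Request-Id'])  # only on this request

session.close()
```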

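14. Timeouts

Requests does not time out by default, so always set a timeout to keep a request from hanging indefinitely:

# 5-second connect timeout, 10-second read timeout
response = requests.get('https://httpbin.org/delay/2', timeout=(5, 10))

# A single value applies to both the connect and the read timeout
response = requests.get('https://httpbin.org/delay/2', timeout=5)

Best practice: in production, nearly every request should set a timeout.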
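
To watch a read timeout fire without depending on an external service, the sketch below starts a deliberately slow local server. SlowHandler and the one-second delay are assumptions made for the demo; the client allows only 0.2 seconds for the server to start answering.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint that waits one second before answering."""

    def do_GET(self):
        time.sleep(1)
        try:
            self.send_response(200)
            self.send_header('Content-Length', '0')
            self.end_headers()
        except OSError:
            pass  # the client has already given up

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(('127.0.0.1', 0), SlowHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f'http://127.0.0.1:{server.server_port}/'

timed_out = False
with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    try:
        # Up to 1 s to connect, 0.2 s to wait for data: the read timeout fires.
        session.get(url, timeout=(1, 0.2))
    except requests.exceptions.ReadTimeout:
        timed_out = True

print(timed_out)

server.shutdown()
server.server_close()
```

The read timeout limits the wait between bytes from the server, not the total download time, so a slow but steady stream does not trigger it.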


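15. Exception Handling

Requests defines several exception types, and your code should handle them properly:

import requests
from requests.exceptions import (
    RequestException,      # base class of all Requests exceptions
    ConnectionError,       # network connection error
    Timeout,               # request timed out
    HTTPError,             # HTTP status code error (4xx, 5xx)
    URLRequired,           # a valid URL is required
    TooManyRedirects,      # too many redirects
)

url = 'https://httpbin.org/get'

try:
    response = requests.get(url, timeout=5)
    response.raise_for_status()
    data = response.json()
    print('Request succeeded!')

except Timeout:
    # Listed before ConnectionError because ConnectTimeout inherits from both
    print('The request timed out; the server is responding too slowly.')
except ConnectionError:
    print('Network connection failed; check your network settings.')
except HTTPError as e:
    print(f'HTTP error: {e}')
except RequestException as e:
    print(f'Request error: {e}')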
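
The order of the except clauses matters because these exception classes form a hierarchy. The relationships can be checked directly; the snippet below only inspects the classes and makes no request.

```python
from requests import exceptions as exc

# ConnectTimeout inherits from both ConnectionError and Timeout,
# so whichever of the two except clauses comes first will catch it.
connect_is_connection_error = issubclass(exc.ConnectTimeout, exc.ConnectionError)
connect_is_timeout = issubclass(exc.ConnectTimeout, exc.Timeout)

# ReadTimeout is a Timeout but not a ConnectionError.
read_is_connection_error = issubclass(exc.ReadTimeout, exc.ConnectionError)

# All of these derive from RequestException, which is why a final
# `except RequestException` works as a catch-all.
all_derive_from_base = all(
    issubclass(cls, exc.RequestException)
    for cls in (exc.ConnectionError, exc.Timeout, exc.HTTPError, exc.TooManyRedirects)
)

print(connect_is_connection_error, connect_is_timeout)
print(read_is_connection_error)
print(all_derive_from_base)
```

Because RequestException is the base class, it belongs in the last except clause; placed first, it would swallow every more specific case.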

16. Authentication

16.1 Basic Auth

from requests.auth import HTTPBasicAuth

response = requests.get('https://httpbin.org/basic-auth/user/passwd',
                        auth=HTTPBasicAuth('user', 'passwd'))

# Shorthand
response = requests.get('https://httpbin.org/basic-auth/user/passwd',
                        auth=('user', 'passwd'))
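
Under the hood, Basic Auth is just an Authorization header containing the Base64-encoded "user:password" pair. The sketch below prepares the request locally (nothing is sent) and compares the header with a hand-built value.

```python
import base64

import requests

prepared = requests.Request(
    'GET',
    'https://httpbin.org/basic-auth/user/passwd',
    auth=('user', 'passwd'),
).prepare()  # built locally, nothing is sent

# Build the expected header by hand for comparison.
expected = 'Basic ' + base64.b64encode(b'user:passwd').decode('ascii')

print(prepared.headers['Authorization'])
print(prepared.headers['Authorization'] == expected)
```

Base64 is an encoding, not encryption, so Basic Auth should only be used over HTTPS.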

16.2 Digest Auth

from requests.auth import HTTPDigestAuth

response = requests.get('https://httpbin.org/digest-auth/auth/user/passwd',
                         auth=HTTPDigestAuth('user', 'passwd'))

16.3 Bearer Token Authentication

headers = {'Authorization': 'Bearer your_token_here'}
response = requests.get('https://api.example.com/protected', headers=headers)

16.4 OAuth Authentication

This requires the requests-oauthlib package:

pip install requests-oauthlib

from requests_oauthlib import OAuth1

auth = OAuth1('consumer_key', 'consumer_secret',
              'oauth_token', 'oauth_token_secret')
response = requests.get('https://api.example.com/protected', auth=auth)

17. Proxies

17.1 Basic Proxy

proxies = {
    'http': 'http://proxy.example.com:8080',
    'https': 'https://proxy.example.com:8080'
}

response = requests.get('https://httpbin.org/ip', proxies=proxies)

17.2 Proxy with Authentication

proxies = {
    'http': 'http://user:password@proxy.example.com:8080',
    'https': 'https://user:password@proxy.example.com:8080'
}

response = requests.get('https://httpbin.org/ip', proxies=proxies)

17.3 SOCKS Proxy

This requires the requests[socks] extra (quoted so the shell does not interpret the brackets):

pip install "requests[socks]"

proxies = {
    'http': 'socks5://proxy.example.com:1080',
    'https': 'socks5://proxy.example.com:1080'
}

response = requests.get('https://httpbin.org/ip', proxies=proxies)

18. Advanced Usage

18.1 Controlling Redirects

# Follow redirects (the default)
response = requests.get('https://httpbin.org/redirect/1')

# Do not follow redirects
response = requests.get('https://httpbin.org/redirect/1', allow_redirects=False)
print(response.status_code)  # 302
print(response.headers['Location'])  # the redirect target URL
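
When redirects are followed, the intermediate responses are kept in response.history. The sketch below demonstrates both modes against a throwaway local server; RedirectHandler and the /old and /new paths are made up for the demo.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical endpoint: /old redirects to /new."""

    def do_GET(self):
        if self.path == '/old':
            self.send_response(302)
            self.send_header('Location', '/new')
        else:
            self.send_response(200)
        self.send_header('Content-Length', '0')
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


server = HTTPServer(('127.0.0.1', 0), RedirectHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f'http://127.0.0.1:{server.server_port}'

with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    followed = session.get(f'{base}/old', timeout=5)
    not_followed = session.get(f'{base}/old', timeout=5, allow_redirects=False)

# Followed: the final response plus the redirect chain in .history
print(followed.status_code, followed.url)
print([r.status_code for r in followed.history])

# Not followed: the redirect response itself
print(not_followed.status_code, not_followed.headers['Location'])

server.shutdown()
server.server_close()
```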

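18.2 SSL Certificate Verification

# Skip SSL verification (not recommended; for testing only)
response = requests.get('https://example.com', verify=False)

# Use a specific CA certificate
response = requests.get('https://example.com', verify='/path/to/certfile')

# Client certificate
response = requests.get('https://example.com',
                        cert=('/path/client.cert', '/path/client.key'))

18.3 Streaming Large Downloads

For large files, stream the download so the whole body is never held in memory at once: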

response = requests.get('https://example.com/large-file.zip', stream=True)

with open('large-file.zip', 'wb') as f:
    for chunk in response.iter_content(chunk_size=8192):
        f.write(chunk)

18.4 Event Hooks

def print_response(response, *args, **kwargs):
    print(f'Request URL: {response.url}')
    print(f'Status code: {response.status_code}')

response = requests.get('https://httpbin.org/get', hooks={'response': print_response})

18.5 Prepared Requests

When you need fine-grained control over a request:

from requests import Request, Session

session = Session()

req = Request('GET', 'https://httpbin.org/get',
              params={'key': 'value'},
              headers={'User-Agent': 'MyApp/1.0'})

prepared = session.prepare_request(req)
response = session.send(prepared)
print(response.json())

19. Best Practices

✅ Recommended

# 1. Use a Session to reuse connections
with requests.Session() as session:
    session.headers.update({'User-Agent': 'MyApp/1.0'})
    response = session.get('https://api.example.com/data')

# 2. Always set a timeout
response = requests.get('https://api.example.com', timeout=10)

# 3. Handle exceptions properly
try:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
except requests.RequestException as e:
    print(f'Request failed: {e}')

# 4. Stream large downloads
with requests.get(url, stream=True) as response:
    response.raise_for_status()
    with open('file.zip', 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

❌ To avoid

# 1. Don't omit the timeout
response = requests.get(url)  # ❌ may wait forever

# 2. Don't skip SSL verification in production
response = requests.get(url, verify=False)  # ❌ security risk

# 3. Don't skip exception handling
response = requests.get(url)  # ❌ possible exceptions go unhandled
data = response.json()        # ❌ may raise JSONDecodeError

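20. Summary

Feature             Method / parameter
GET request         requests.get(url, params=..., headers=...)
POST request        requests.post(url, data=..., json=..., files=...)
PUT request         requests.put(url, data=...)
DELETE request      requests.delete(url)
Response text       response.text
Response JSON       response.json()
Response bytes      response.content
Status code         response.status_code
Response headers    response.headers
Cookies             cookies parameter / response.cookies
Timeout             timeout parameter
Authentication      auth parameter
Proxies             proxies parameter
Session             requests.Session()
Exception handling  try/except + raise_for_status()

Requests is the Swiss Army knife of HTTP in Python, and the material in this tutorial covers the vast majority of everyday request scenarios. For more advanced usage, see the official documentation.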

