Requests
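The requests module is the standard Python library for making HTTP requests. With it, we can issue any kind of HTTP request from Python code.
This article starts with the basics and works through the following topics:
- How to make ordinary HTTP requests with requests
- How to customize the headers and data of an HTTP request
- How to inspect the HTTP request and response
- How to authenticate with requests
Before you start, you need a basic understanding of HTTP. You do not have to be an expert, but the fundamentals are required.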
Hello world
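First, install the module. You can install requests with the following command:

    $ pip install requests

Once it is installed, import requests in your Python script like this:

    import requests

With the import in place, let's look at the most basic HTTP method, GET.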
The GET
The most common HTTP methods are GET and POST, and which one you use depends on what you want to do. This article focuses on those two.
GET indicates that you want to retrieve data from the server. To send a GET request with requests, a single call is enough:

    requests.get()

To test this out, you can make a GET request to GitHub's Root REST API by calling get() with the following URL:

    $ python3
    Python 3.7.3 (default, Nov 15 2019, 04:04:52)
    [Clang 11.0.0 (clang-1100.0.33.16)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import requests
    >>> requests.get('https://api.github.com')
    <Response [200]>
    >>>

Congratulations, you have made your first request.
The Response
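A Response object lets us inspect the result of the HTTP request we made. Using the same example, this time we store the return value in a variable named response so we can see exactly what the GET request returned:

    >>> response = requests.get('https://api.github.com')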
Status Codes
The first piece of information we can read from the response is the HTTP status code, which tells you whether the request succeeded or failed.
For example, a 200 status means the request succeeded, a 404 means the resource you requested does not exist, and a 503 (which you may have run into on my own domain) means Service Unavailable.
You can read the status code of the request through .status_code:

    >>> response.status_code
    200

Sometimes you need to make decisions based on the status code, for example:

    if response.status_code == 200:
        print('Success!')
    elif response.status_code == 404:
        print('Not Found.')

You can also use the response itself as the condition. A Response evaluates to True when the status code is below 400, and to False otherwise:

    if response:
        print('Success!')
    else:
        print('An error has occurred.')

A more complete example, which uses .raise_for_status() to raise an exception for unsuccessful status codes:

    import requests
    from requests.exceptions import HTTPError

    for url in ['https://api.github.com', 'https://api.github.com/invalid']:
        try:
            response = requests.get(url)

            # If the response was successful, no Exception will be raised
            response.raise_for_status()
        except HTTPError as http_err:
            print(f'HTTP error occurred: {http_err}')  # Python 3.6
        except Exception as err:
            print(f'Other error occurred: {err}')  # Python 3.6
        else:
            print('Success!')
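To make the mapping between status codes and outcomes concrete, here is a small standard-library sketch that needs no network access. The helper name describe_status and its category labels are illustrative assumptions, not part of requests. The cut-off at 400 mirrors the `if response:` check shown above.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short summary for an HTTP status code (hypothetical helper)."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Codes the standard library does not know about
        phrase = 'Unknown'

    if code < 400:
        outcome = 'ok'
    elif code < 500:
        outcome = 'client error'
    else:
        outcome = 'server error'
    return f'{code} {phrase}: {outcome}'


for code in (200, 404, 503):
    print(describe_status(code))
# 200 OK: ok
# 404 Not Found: client error
# 503 Service Unavailable: server error
```

In real code you would pass response.status_code to such a helper, or simply rely on response.raise_for_status() as in the example above.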
Content
The response to a GET request usually carries the information we actually care about in its message body, known as the payload. You can view the payload in several different formats.
Using .content:

    >>> response = requests.get('https://api.github.com')
    >>> response.content
    b'{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
.content gives you the raw bytes of the payload, which may not be the format you want. Most of the time you need a string, so .text is usually the better choice:

    >>> response.text
    '{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
Decoding bytes into a str requires an encoding. If you do not specify one, requests guesses the encoding from the response's headers. You can also set the encoding explicitly:

    >>> response.encoding = 'utf-8'  # Optional: requests infers this internally
    >>> response.text
    '{"current_user_url":"https://api.github.com/user","current_user_authorizations_html_url":"https://github.com/settings/connections/applications{/client_id}","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","commit_search_url":"https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","followers_url":"https://api.github.com/user/followers","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
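The relationship between the raw bytes, the Content-Type header, and the decoded text can be sketched with the standard library alone. This is only an illustration of the idea: the helper name charset_from_content_type, the sample payload, and the utf-8 fallback are assumptions, and requests applies its own, more elaborate fallback rules when the header names no charset.

```python
from email.message import Message


def charset_from_content_type(content_type: str, default: str = 'utf-8') -> str:
    """Extract the charset parameter from a Content-Type header value."""
    message = Message()
    message['Content-Type'] = content_type
    # get_content_charset() returns None when no charset parameter is present
    return message.get_content_charset() or default


# Sample payload standing in for response.content (hypothetical data)
raw_bytes = '{"city": "Zürich"}'.encode('utf-8')

encoding = charset_from_content_type('application/json; charset=utf-8')
text = raw_bytes.decode(encoding)

print(encoding)
print(text)
```

Decoding the same bytes with the wrong encoding does not necessarily fail, but it can silently produce garbled characters, which is why setting response.encoding explicitly is sometimes useful.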
If you look closely, you will see that the text is actually serialized JSON. The .json() method deserializes it into Python objects, which makes the data much easier to work with:

    >>> response.json()
    {'current_user_url': 'https://api.github.com/user', 'current_user_authorizations_html_url': 'https://github.com/settings/connections/applications{/client_id}', 'authorizations_url': 'https://api.github.com/authorizations', 'code_search_url': 'https://api.github.com/search/code?q={query}{&page,per_page,sort,order}', 'commit_search_url': 'https://api.github.com/search/commits?q={query}{&page,per_page,sort,order}', 'emails_url': 'https://api.github.com/user/emails', 'emojis_url': 'https://api.github.com/emojis', 'events_url': 'https://api.github.com/events', 'feeds_url': 'https://api.github.com/feeds', 'followers_url': 'https://api.github.com/user/followers', 'following_url': 'https://api.github.com/user/following{/target}', 'gists_url': 'https://api.github.com/gists{/gist_id}', 'hub_url': 'https://api.github.com/hub', 'issue_search_url': 'https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}', 'issues_url': 'https://api.github.com/issues', 'keys_url': 'https://api.github.com/user/keys', 'notifications_url': 'https://api.github.com/notifications', 'organization_repositories_url': 'https://api.github.com/orgs/{org}/repos{?type,page,per_page,sort}', 'organization_url': 'https://api.github.com/orgs/{org}', 'public_gists_url': 'https://api.github.com/gists/public', 'rate_limit_url': 'https://api.github.com/rate_limit', 'repository_url': 'https://api.github.com/repos/{owner}/{repo}', 'repository_search_url': 'https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}', 'current_user_repositories_url': 'https://api.github.com/user/repos{?type,page,per_page,sort}', 'starred_url': 'https://api.github.com/user/starred{/owner}{/repo}', 'starred_gists_url': 'https://api.github.com/gists/starred', 'team_url': 'https://api.github.com/teams', 'user_url': 'https://api.github.com/users/{user}', 'user_organizations_url': 'https://api.github.com/user/orgs', 'user_repositories_url': 'https://api.github.com/users/{user}/repos{?type,page,per_page,sort}', 'user_search_url': 'https://api.github.com/search/users?q={query}{&page,per_page,sort,order}'}

The return value of .json() here is a dictionary, so you can access its values by key and work with the data freely.
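As a rough illustration of what happens when JSON text becomes a dictionary, the sketch below parses a shortened copy of the payload with the standard json module. Treating response.json() as roughly equivalent to json.loads(response.text) is a simplifying assumption, and the two-key sample string is a trimmed stand-in for the full response.

```python
import json

# Shortened stand-in for response.text (two keys taken from the output above)
text = (
    '{"current_user_url":"https://api.github.com/user",'
    '"emojis_url":"https://api.github.com/emojis"}'
)

data = json.loads(text)

print(type(data).__name__)   # dict
print(data['emojis_url'])    # https://api.github.com/emojis

# Once parsed, the usual dictionary operations apply
for key, value in sorted(data.items()):
    print(f'{key} -> {value}')
```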
Headers
The response headers can give you a lot of useful information, such as the content type of the payload (including the utf-8 charset mentioned above) and rate limits. To view the headers, use .headers:

    >>> response.headers
    {'Server': 'GitHub.com', 'Date': 'Mon, 10 Dec 2018 17:49:54 GMT', 'Content-Type': 'application/json; charset=utf-8', 'Transfer-Encoding': 'chunked', 'Status': '200 OK', 'X-RateLimit-Limit': '60', 'X-RateLimit-Remaining': '59', 'X-RateLimit-Reset': '1544467794', 'Cache-Control': 'public, max-age=60, s-maxage=60', 'Vary': 'Accept', 'ETag': 'W/"7dc470913f1fe9bb6c7355b50a0737bc"', 'X-GitHub-Media-Type': 'github.v3; format=json', 'Access-Control-Expose-Headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type', 'Access-Control-Allow-Origin': '*', 'Strict-Transport-Security': 'max-age=31536000; includeSubdomains; preload', 'X-Frame-Options': 'deny', 'X-Content-Type-Options': 'nosniff', 'X-XSS-Protection': '1; mode=block', 'Referrer-Policy': 'origin-when-cross-origin, strict-origin-when-cross-origin', 'Content-Security-Policy': "default-src 'none'", 'Content-Encoding': 'gzip', 'X-GitHub-Request-Id': 'E439:4581:CF2351:1CA3E06:5C0EA741'}

.headers returns a dictionary-like object, so you can access a header value by key, for example:

    >>> response.headers['Content-Type']
    'application/json; charset=utf-8'

This headers object has one special property. HTTP header names are case-insensitive, so you do not need to worry about capitalization when you look up a header:

    >>> response.headers['content-type']
    'application/json; charset=utf-8'

Whether you use the key 'content-type' or 'Content-Type', you'll get the same value.
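The case-insensitive lookup can be pictured with a minimal sketch like the one below. The class name CaseInsensitiveHeaders and its implementation are illustrative assumptions written for this article; they are not the code requests uses internally.

```python
class CaseInsensitiveHeaders:
    """Minimal mapping that ignores the case of header names (sketch only)."""

    def __init__(self, headers=None):
        # Maps lowercased name -> (original name, value)
        self._store = {}
        for name, value in (headers or {}).items():
            self[name] = value

    def __setitem__(self, name, value):
        self._store[name.lower()] = (name, value)

    def __getitem__(self, name):
        return self._store[name.lower()][1]

    def __contains__(self, name):
        return name.lower() in self._store


headers = CaseInsensitiveHeaders({'Content-Type': 'application/json; charset=utf-8'})

print(headers['Content-Type'])
print(headers['content-type'])
print('CONTENT-TYPE' in headers)
```

Normalizing the key on both write and read is the whole trick: the original spelling is kept only for display purposes.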
Query String Parameters
A common way to customize a GET request is to append parameters to the URL as a query string. With get(), you pass that data through the params argument, for example:

    import requests

    # Search GitHub's repositories for requests
    response = requests.get(
        'https://api.github.com/search/repositories',
        params={'q': 'requests+language:python'},
    )

    # Inspect some attributes of the `requests` repository
    json_response = response.json()
    repository = json_response['items'][0]
    print(f'Repository name: {repository["name"]}')  # Python 3.6+
    print(f'Repository description: {repository["description"]}')  # Python 3.6+

The example above is roughly equivalent to:

    curl 'https://api.github.com/search/repositories?q=requests+language:python'
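To see how a params dictionary turns into a query string, here is a standard-library sketch using urllib.parse.urlencode. It is meant as an approximation: requests performs a similar encoding step, but assuming the results match for every character would be a guess. One detail worth noticing is that a literal '+' in a value is percent-encoded, while a space is what becomes '+' in the URL.

```python
from urllib.parse import urlencode

base_url = 'https://api.github.com/search/repositories'

# A literal plus sign in the value is escaped as %2B
literal_plus = urlencode({'q': 'requests+language:python'})

# A space in the value is what gets written as '+'
with_space = urlencode({'q': 'requests language:python'})

print(f'{base_url}?{literal_plus}')
# https://api.github.com/search/repositories?q=requests%2Blanguage%3Apython
print(f'{base_url}?{with_space}')
# https://api.github.com/search/repositories?q=requests+language%3Apython
```

This is why the curl command and the params version are only roughly equivalent: in a hand-written URL the '+' already stands for a space.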
Request Headers
You can also customize the headers of a GET request by passing a dictionary of HTTP headers to get() through the headers parameter, for example:

    import requests

    response = requests.get(
        'https://api.github.com/search/repositories',
        params={'q': 'requests+language:python'},
        headers={'Accept': 'application/vnd.github.v3.text-match+json'},
    )

    # View the new `text-matches` array which provides information
    # about your search term within the results
    json_response = response.json()
    repository = json_response['items'][0]
    print(f'Text matches: {repository["text_matches"]}')
Other HTTP Methods
Besides GET, other common HTTP methods include POST, PUT, DELETE, HEAD, PATCH, and OPTIONS. requests provides a function for each of them:

    >>> requests.post('https://httpbin.org/post', data={'key':'value'})
    >>> requests.put('https://httpbin.org/put', data={'key':'value'})
    >>> requests.delete('https://httpbin.org/delete')
    >>> requests.head('https://httpbin.org/get')
    >>> requests.patch('https://httpbin.org/patch', data={'key':'value'})
    >>> requests.options('https://httpbin.org/get')

You can inspect the response of each method in the same way as before, for example:

    >>> response = requests.head('https://httpbin.org/get')
    >>> response.headers['Content-Type']
    'application/json'

    >>> response = requests.delete('https://httpbin.org/delete')
    >>> json_response = response.json()
    >>> json_response['args']
    {}

Next, let's take a closer look at POST, PUT, and PATCH.
The Message Body
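According to the HTTP specification, POST, PUT, and the less common PATCH pass their data through the message body instead of through the query string (the parameters appended to the URL). With requests, you pass the payload through the data parameter.
data accepts a dictionary, a list of tuples, bytes, or a file-like object. In practice, you send whatever format the server expects. For example, if the request's content type is application/x-www-form-urlencoded, you can send the form data as a dictionary:

    >>> requests.post('https://httpbin.org/post', data={'key':'value'})
    <Response [200]>

You can also send the same data as a list of tuples:

    >>> requests.post('https://httpbin.org/post', data=[('key', 'value')])
    <Response [200]>

If you need to send JSON data, use the json parameter instead. When you pass data through json, requests serializes it and adds the correct Content-Type header for you.
You can test this with httpbin.org, a service created by Kenneth Reitz, the original author of requests:

    >>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
    >>> json_response = response.json()
    >>> json_response['data']
    '{"key": "value"}'
    >>> json_response['headers']['Content-Type']
    'application/json'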
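The difference between a form-encoded body and a JSON body is easiest to see side by side. The following standard-library sketch shows the two serializations of the same payload. It approximates what ends up in the message body and is not a copy of how requests builds it.

```python
import json
from urllib.parse import urlencode

payload = {'key': 'value'}

# application/x-www-form-urlencoded: what data={'key': 'value'} corresponds to
form_body = urlencode(payload)

# The list-of-tuples form produces the same encoded body
form_body_from_tuples = urlencode([('key', 'value')])

# application/json: what json={'key': 'value'} corresponds to
json_body = json.dumps(payload)

print(form_body)              # key=value
print(form_body_from_tuples)  # key=value
print(json_body)              # {"key": "value"}
```

The server decides how to parse the body from the Content-Type header, which is why the body format and the header have to agree.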
Inspecting Your Request
When you make a request, requests prepares it before actually sending it to the server. Preparation includes steps such as validating headers and serializing JSON content. You can view the prepared request through .request:

    >>> response = requests.post('https://httpbin.org/post', json={'key':'value'})
    >>> response.request.headers['Content-Type']
    'application/json'
    >>> response.request.url
    'https://httpbin.org/post'
    >>> response.request.body
    b'{"key": "value"}'

Inspecting the PreparedRequest gives you access to all kinds of information about the request being made such as payload, URL, headers, authentication, and more.
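A prepared request can also be built and inspected without sending anything, which is handy for debugging. The sketch below assumes the requests package is installed. The 'page' query parameter is a made-up value added only to show how the URL is assembled.

```python
import json

import requests

# Describe the request without sending it
request = requests.Request(
    'POST',
    'https://httpbin.org/post',
    params={'page': '1'},    # hypothetical query parameter, for illustration
    json={'key': 'value'},
)

# prepare() returns a PreparedRequest; no network traffic happens here
prepared = request.prepare()

print(prepared.method)
print(prepared.url)
print(prepared.headers['Content-Type'])
print(json.loads(prepared.body))
```

Because nothing is sent, this is a safe way to check the final URL, headers, and body before pointing the code at a real server.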
So far, you've made a lot of different kinds of requests, but they've all had one thing in common: they're unauthenticated requests to public APIs. Many services you may come across will want you to authenticate in some way.
Authentication
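Authentication helps a server know who you are. Typically, you provide your credentials through the Authorization header or through a custom header defined by the service.
One example is GitHub's Authenticated User API. To access it, you send your username and password along with the request so that the server can authenticate you. (GitHub no longer accepts account passwords for API authentication, so today you would enter a personal access token at the prompt instead.)

    >>> from getpass import getpass
    >>> requests.get('https://api.github.com/user', auth=('username', getpass()))
    <Response [200]>

If the server accepts your credentials, you get a 200 status code. If you make the request without valid credentials, you get a 401 Unauthorized status code:

    >>> requests.get('https://api.github.com/user')
    <Response [401]>

When you pass a (username, password) tuple to the auth parameter, requests applies the credentials using HTTP Basic authentication. You can therefore make the same request by passing an explicit HTTPBasicAuth object:

    >>> from requests.auth import HTTPBasicAuth
    >>> from getpass import getpass
    >>> requests.get(
    ...   'https://api.github.com/user',
    ...   auth=HTTPBasicAuth('username', getpass())
    ... )
    <Response [200]>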
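For readers curious about what HTTP Basic authentication actually puts on the wire, here is a standard-library sketch of the Authorization header. The helper name basic_auth_header is hypothetical and the choice of UTF-8 for encoding the credentials is an assumption. The sample credentials are the well-known example pair from the Basic authentication RFC.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict:
    """Build an Authorization header for HTTP Basic auth (illustrative helper)."""
    credentials = f'{username}:{password}'.encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    return {'Authorization': f'Basic {token}'}


header = basic_auth_header('Aladdin', 'open sesame')
print(header)
# {'Authorization': 'Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='}

# Base64 is an encoding, not encryption: anyone can reverse it
print(base64.b64decode(header['Authorization'].split()[1]))
# b'Aladdin:open sesame'
```

Since the credentials are only encoded, Basic authentication should be used over HTTPS.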
SSL Certificate Verification
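Whenever the data you send or receive is sensitive, HTTPS is the better choice. It encrypts the connection, and the client verifies the server's SSL certificate. requests performs this verification by default.
You can disable verification by setting verify to False. This is not recommended, although it is common in practice:

    >>> requests.get('https://api.github.com', verify=False)
    InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
      InsecureRequestWarning)
    <Response [200]>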
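What "verifying the certificate" means can be illustrated at the TLS layer with Python's ssl module. This is an analogy rather than a description of how requests and urllib3 configure their connections internally. The variable names are illustrative.

```python
import ssl

# A default client context verifies certificates and checks the hostname
verified = ssl.create_default_context()
print(verified.verify_mode == ssl.CERT_REQUIRED)  # True
print(verified.check_hostname)                    # True

# Conceptually, skipping verification corresponds to a context like this one.
# check_hostname has to be disabled before verify_mode can be set to CERT_NONE.
unverified = ssl.create_default_context()
unverified.check_hostname = False
unverified.verify_mode = ssl.CERT_NONE
print(unverified.verify_mode == ssl.CERT_NONE)    # True
```

With verification off, the connection is still encrypted, but the client no longer has any assurance about who is on the other end.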
The Session Object
What is a Session? Suppose you want to call 10 endpoints that all require authentication. Supplying credentials on every call is tedious. With a session you configure them once and reuse them, along with cookies and the underlying connection, so the other 9 requests do not fail with 401 (Unauthorized).
Here is an example:

    import requests
    from getpass import getpass

    # By using a context manager, you can ensure the resources used by
    # the session will be released after use
    with requests.Session() as session:
        session.auth = ('username', getpass())

        # Instead of requests.get(), you'll use session.get()
        response = session.get('https://api.github.com/user')

    # You can inspect the response just like you did before
    print(response.headers)
    print(response.json())

Once the session has been set up with your credentials, every request you make through it carries them automatically.
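The way a session carries its settings into each request can be checked offline by preparing requests instead of sending them. The sketch below assumes the requests package is installed. The password is a placeholder, and the Accept header value and the second URL are illustrative choices rather than part of the original example.

```python
import requests

urls = (
    'https://api.github.com/user',
    'https://api.github.com/user/repos',
)

with requests.Session() as session:
    # Configure once; placeholder credentials, never hard-code real ones
    session.auth = ('username', 'example-password')
    session.headers.update({'Accept': 'application/vnd.github.v3+json'})

    # prepare_request() merges the session settings into each request
    # without sending anything over the network
    prepared_requests = [
        session.prepare_request(requests.Request('GET', url)) for url in urls
    ]

for prepared in prepared_requests:
    scheme = prepared.headers['Authorization'].split()[0]
    print(prepared.url, scheme, prepared.headers['Accept'])
```

Every prepared request ends up with the same Authorization and Accept headers, even though neither was passed to the individual Request objects.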
Reference: https://realpython.com/python-requests/