The Internet is an enormous source of data, and websites often offer RESTful API endpoints (URLs, URIs) to share data via HTTP requests. HTTP requests use methods such as GET, POST, PUT, and DELETE to access and manipulate resources or data. Websites often require a registration process to access their RESTful APIs, or offer no API at all. In those cases, we can instead download the data as raw text and format it ourselves, for instance downloading content from a personal blog or the profile information of a GitHub user without any registration. This guide explains how to make web requests in Python using the Requests package and its various features.
pip3 install requests
Use pip instead of pip3 on Python 2. Tools such as virtualenv (or the venv module bundled with Python 3) let you manage dependencies and development environments separately across multiple applications.
To make a REST call, the first step is to import the requests module into the current environment.
import requests  # Make the Requests package available in this program
response = requests.get("https://www.dummyurl.com")  # Execute a GET request; the URL must include a scheme such as https://
Python also provides a way to create aliases using the as keyword.
import requests as reqs
response = reqs.get('https://www.google.com')
To make the first request, we will use the JSONPlaceholder API, which provides JSON responses for specific items like posts, todos, and albums. The /todos/1 endpoint responds with the details of a single TODO item.
url = 'https://jsonplaceholder.typicode.com/todos/1'
response = requests.get(url)  # Execute the GET request
print(response.status_code)  # Print the HTTP response code
print(response.text)  # Print the response body as text (JSON here)
Executing the above snippet produces the following result:
200
{
  "userId": 1,
  "id": 1,
  "title": "delectus aut autem",
  "completed": false
}
The status code 200 means the request executed successfully, and response.text returns the actual JSON response of the TODO item.
There are many public APIs available to test REST calls. You can also use Postman Echo or mocky to return customized responses and headers, and to add a delay to the generated dummy link.
POST requests are better suited to sensitive data because they carry it in the message body rather than in the URL. GET requests append the parameters to the URL, which is also visible in the browser history. Neither method encrypts anything by itself: HTTPS (SSL/TLS) connections encrypt the POST body and the GET parameters alike while they are in transit. Without HTTPS, POST is still the better choice for sensitive data because the data stays out of the URL, although the body travels unencrypted and can be intercepted.
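To make the difference concrete, here is a minimal sketch that prepares a GET and a POST request without sending them and shows where the data ends up. It assumes the JSONPlaceholder URL used elsewhere in this guide; the payload values are only illustrative.

```python
import requests

url = "https://jsonplaceholder.typicode.com/posts"
payload = {"title": "Python Requests", "userId": 1}  # illustrative values

# Build both requests locally; nothing is sent over the network.
get_req = requests.Request("GET", url, params=payload).prepare()
post_req = requests.Request("POST", url, data=payload).prepare()

print(get_req.url)    # the parameters are appended to the URL as a query string
print(get_req.body)   # None: a GET built this way has no body
print(post_req.url)   # the URL is unchanged
print(post_req.body)  # the parameters travel in the form-encoded body
```

Because the GET parameters are part of the URL, they end up wherever URLs are recorded, which is the practical reason to keep sensitive values out of them.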
A dictionary object can be used to send the data, as key-value pairs, as the second parameter to the post method.
data = {'title': 'Python Requests', 'body': 'Requests are awesome', 'userId': 1}
response = requests.post('https://jsonplaceholder.typicode.com/posts', data)
print(response.status_code)
print(response.text)
This dummy POST request returns the attached data as the response body:
201
{
  "title": "Python Requests",
  "body": "Requests are awesome",
  "userId": "1",
  "id": 101
}
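POST requests have no practical restriction on data length, so they are more suitable for files and images. GET requests are limited by the maximum URL length that browsers and servers accept (about 2 kilobytes is a commonly cited safe limit, although some servers can handle up to 64 KB), and URLs may only contain ASCII characters, so other values must be percent-encoded.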
Just like post, Requests also supports other methods such as put and delete. Any request can be sent without any data, and arguments can be passed by keyword name to enhance code clarity.
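dictionaryObject = {'title': 'Python Requests', 'body': 'Requests are awesome', 'userId': 1}  # any JSON-serializable dictionary
response = requests.post('https://jsonplaceholder.typicode.com/posts', data=None, json=dictionaryObject)
print(response.json())  # prints the parsed response, which includes 'id': 101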
In this case data is set to None explicitly, but it can be omitted because None is already its default value.
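The difference between the data and json arguments can be checked without sending anything. The sketch below prepares the same POST request both ways and inspects the body encoding that Requests chooses; the payload is illustrative.

```python
import json

import requests

url = "https://jsonplaceholder.typicode.com/posts"
payload = {"title": "Python Requests", "userId": 1}  # illustrative values

# Prepare (but do not send) the same POST with each argument.
form_req = requests.Request("POST", url, data=payload).prepare()
json_req = requests.Request("POST", url, json=payload).prepare()

print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # key=value pairs joined with &
print(json_req.headers["Content-Type"])  # application/json
print(json.loads(json_req.body))         # the original dictionary, types preserved
```

Form encoding turns every value into text, which is a likely reason the earlier POST output shows "userId" as the string "1" rather than the number 1.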
The response object can be read as a string, bytes, JSON, or raw data:
print(response.content)  # Print the response as bytes
print(response.text)  # Print the response as a unicode string
jsonRes = response.json()  # Parse the JSON response into a dictionary
print(jsonRes['title'], jsonRes['body'], sep=' : ')  # output: Python Requests : Requests are awesome
Reading the response as a raw value allows us to read a specific number of bytes. To enable this, set stream = True as a parameter in the request method.
data = {'title': 'Python Requests', 'body': 'Requests are awesome', 'userId': 1}
response = requests.post('https://jsonplaceholder.typicode.com/posts', data, stream=True)
print(response.raw.read(30))  # output: b'{\n "title": "Python Requests"'
To enable streaming, stream has to be passed explicitly by keyword because it is an optional argument that is off by default.
You can also use the iter_content method, which automatically decodes gzip and deflate transfer-encodings.
response.iter_content(chunk_size=1024)
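A common use of iter_content is saving a large download to disk without holding it all in memory. The following is a minimal sketch; save_stream is a hypothetical helper name, and to keep the example runnable offline it feeds the helper a hand-built Response wrapped around an in-memory buffer. In real use the response would come from requests.get(url, stream=True).

```python
import io
import os
import tempfile

import requests


def save_stream(response, path, chunk_size=1024):
    """Write a streamed response body to disk one chunk at a time."""
    written = 0
    with open(path, "wb") as handle:
        for chunk in response.iter_content(chunk_size=chunk_size):
            handle.write(chunk)
            written += len(chunk)
    return written


# Offline stand-in (an assumption of this sketch, not a real download):
# a Response object whose raw stream is an in-memory buffer.
fake_response = requests.Response()
fake_response.status_code = 200
fake_response.raw = io.BytesIO(b"x" * 2500)

target = os.path.join(tempfile.mkdtemp(), "download.bin")
total = save_stream(fake_response, target)
print(total, os.path.getsize(target))  # both values are the number of bytes written
```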
Many APIs require authentication to allow access to user-specific details. Requests supports various types of authentication, such as:
Basic authentication sends the credentials with base64 encoding (text as bytes), meaning there is no encryption and no security. It is suitable only for HTTPS (SSL/TLS) connections, where security is built in.
# Open GitHub API to test authentication
import requests
from requests.auth import HTTPBasicAuth
requests.get('https://api.github.com/user', auth=HTTPBasicAuth('userName', 'password'))

# or the shortcut method
requests.get('https://api.github.com/user', auth=('user', 'pass'))
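To see why base64 offers no protection, this short sketch prepares a request with the shortcut auth tuple (without sending it) and decodes the resulting Authorization header. The credentials are the placeholder values from the snippet above.

```python
import base64

import requests

# Prepare the request locally; nothing is sent to GitHub.
prepared = requests.Request(
    "GET", "https://api.github.com/user", auth=("user", "pass")
).prepare()

header = prepared.headers["Authorization"]
scheme, encoded = header.split(" ", 1)

# base64 is an encoding, not encryption: anyone can reverse it.
decoded = base64.b64decode(encoded).decode("utf-8")
print(scheme)   # Basic
print(decoded)  # user:pass
```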
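Digest authentication sends a hash of the credentials instead of the base64-encoded text:
import requests
from requests.auth import HTTPDigestAuth
response = requests.get('https://postman-echo.com/digest-auth', auth=HTTPDigestAuth('postman', 'password'))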
Digest authentication can still be attacked, so HTTPS (SSL/TLS) security should be preferred over relying on digest authentication alone.
A header contains information about the client (type of browser), server, accepted response type, IP address, etc. Request headers can be customized, for example the source browser (user-agent) and content-type, and response headers can be viewed using the headers property:
headers = {'user-agent': 'customize header string', 'Content-Type': 'application/json; charset=utf-8'}
response = requests.get(url, headers=headers)  # modify request headers
print(response.headers)  # print response headers
print(response.headers['Content-Type'])  # output: application/json; charset=utf-8
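Header names are case-insensitive in HTTP, and Requests stores response headers in a case-insensitive dictionary for that reason. The sketch below uses that dictionary class directly with an illustrative header value, so it runs without a network call.

```python
from requests.structures import CaseInsensitiveDict

# The same mapping type that response.headers uses.
headers = CaseInsensitiveDict({"Content-Type": "application/json; charset=utf-8"})

lower = headers["content-type"]      # lookup ignores case
upper = headers.get("CONTENT-TYPE")  # so does .get()
present = "content-type" in headers  # and membership tests

print(lower)
print(upper == lower, present)
```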
A timeout can be passed to requests to terminate any request if there is no response within the set duration. This avoids waiting indefinitely in case there is no response from the server.
requests.get('https://github.com/', timeout=0.50)
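The timeout can also be given as a (connect, read) tuple to limit the two phases separately. The following sketch demonstrates the read timeout against a local socket that accepts connections but never answers; the socket setup is only scaffolding for the demonstration, and the timeout values are arbitrary.

```python
import socket

import requests

# Scaffolding: a local listener that completes the TCP handshake
# but never sends an HTTP response.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy settings from the environment

try:
    # Allow 1 second to connect and 0.5 seconds to wait for the response.
    session.get(f"http://127.0.0.1:{port}/", timeout=(1, 0.5))
    outcome = "responded"
except requests.exceptions.Timeout as err:
    outcome = type(err).__name__  # ConnectTimeout or ReadTimeout
finally:
    listener.close()

print(outcome)
```

Both ConnectTimeout and ReadTimeout are subclasses of requests.exceptions.Timeout, so a single except clause covers either phase.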
Use the response's url property to track redirected URLs.
response = requests.get('http://github.com/', allow_redirects=True)
print(response.url)  # the final URL after any redirects
To disable redirection, set the allow_redirects parameter to False. By default it is set to True.
Sending sensitive data, such as a password, in GET requests is considered very poor practice even over HTTPS (SSL/TLS). While the data cannot be intercepted in transit, it would be logged in server logs as plain text on the receiving HTTPS server and quite possibly also in browser history. It is probably also available to browser plugins and, possibly, other applications on the client computer.
Requests also supports other parameters such as proxies, cert, and verify.
try:
    response = requests.get(url, timeout=3)
    response.raise_for_status()  # Raise an error for 4xx and 5xx responses
except requests.exceptions.HTTPError as httpErr:
    print("Http Error:", httpErr)
except requests.exceptions.ConnectionError as connErr:
    print("Error Connecting:", connErr)
except requests.exceptions.Timeout as timeOutErr:
    print("Timeout Error:", timeOutErr)
except requests.exceptions.RequestException as reqErr:
    print("Something Else:", reqErr)
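The order of the except clauses above matters because the specific exceptions all derive from RequestException, so the general handler has to come last. The sketch below checks that relationship and triggers raise_for_status offline on a hand-built Response with a 404 status, which is a stand-in for a real server reply.

```python
import requests
from requests.exceptions import ConnectionError, HTTPError, RequestException, Timeout

# Every specific exception is a subclass of RequestException.
all_derived = all(
    issubclass(exc, RequestException) for exc in (HTTPError, ConnectionError, Timeout)
)
print(all_derived)

# Offline stand-in for a failed request (an assumption of this sketch).
response = requests.Response()
response.status_code = 404

try:
    response.raise_for_status()
    caught_status = None
except HTTPError as err:
    # The failing response is attached to the exception.
    caught_status = err.response.status_code

print(caught_status)
```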