Using ETags for Efficient Data Transfer in aiohttp
ETags are unique identifiers assigned to specific versions of a resource. They help determine if the content has changed so you can avoid downloading unchanged data.
In this tutorial, you’ll learn how to implement ETag support in aiohttp servers and clients to reduce unnecessary data transfer.
You’ll also explore advanced methods like optimistic locking and concurrency control using ETags.
Generate ETags
You can generate ETags using various methods, such as content hashing or timestamp-based approaches:
    import time
    import hashlib

    def generate_etag(content):
        # Hash the content so identical content always yields the same ETag
        return hashlib.md5(content.encode()).hexdigest()

    def generate_timestamp_etag():
        # Use the current Unix time in whole seconds
        return str(int(time.time()))

    content = "Hello, World!"
    print(f"Content-based ETag: {generate_etag(content)}")
    print(f"Timestamp-based ETag: {generate_timestamp_etag()}")
Output:
    Content-based ETag: 65a8e27d8879283831b664bd8b7f0ad4
    Timestamp-based ETag: 1726045905
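The content-based ETag stays the same as long as the content is unchanged. The timestamp-based ETag changes whenever it is generated in a different second, so it is only useful if you generate it when the resource changes, not on every request.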
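In the ETag response header, the value is sent as a quoted string, optionally prefixed with W/ to mark a weak validator (RFC 9110). The bare hex digest above would therefore need quoting before it goes on the wire. A minimal sketch, assuming a hypothetical helper named format_etag that is not part of aiohttp or the standard library:

```python
import hashlib


def format_etag(opaque_value, weak=False):
    """Wrap an opaque value in the quoted form used by the ETag header.

    format_etag is a hypothetical helper for illustration only.
    """
    if '"' in opaque_value:
        raise ValueError("ETag values must not contain double quotes")
    # Weak validators carry a W/ prefix in front of the quoted value
    prefix = "W/" if weak else ""
    return f'{prefix}"{opaque_value}"'


digest = hashlib.md5("Hello, World!".encode()).hexdigest()
strong_etag = format_etag(digest)
weak_etag = format_etag(digest, weak=True)
print(strong_etag)
print(weak_etag)
```

A strong ETag signals that the representation is byte-for-byte identical, while a weak one only signals that it is semantically equivalent.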
Implement ETag Support in aiohttp Server
To generate ETags for aiohttp responses, you can create a middleware:
    from aiohttp import web
    import hashlib

    @web.middleware
    async def etag_middleware(request, handler):
        response = await handler(request)
        # Only hash in-memory byte bodies; streamed responses have no body attribute
        body = getattr(response, 'body', None)
        if isinstance(body, bytes):
            # ETag header values are sent as quoted strings
            response.headers['ETag'] = f'"{hashlib.md5(body).hexdigest()}"'
        return response

    app = web.Application(middlewares=[etag_middleware])

    async def hello(request):
        return web.Response(text="Hello, world!")

    app.router.add_get('/', hello)

    if __name__ == '__main__':
        web.run_app(app)
This middleware automatically generates and sets ETags for all responses with a body.
If you check the response header, you’ll see the ETag value.
Handle If-None-Match Requests
To handle If-None-Match requests, check whether the client’s If-None-Match header matches the current ETag and return a 304 Not Modified response with no body if it does:
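    import hashlib
    from aiohttp import web

    async def handle_request(request):
        content = "Hello, World!"
        # ETag header values are sent as quoted strings
        etag = f'"{hashlib.md5(content.encode()).hexdigest()}"'
        # The client already has this version: answer 304 without a body
        if request.headers.get('If-None-Match') == etag:
            return web.Response(status=304, headers={'ETag': etag})
        return web.Response(text=content, headers={'ETag': etag})

    app = web.Application()
    app.router.add_get('/', handle_request)

    if __name__ == '__main__':
        web.run_app(app)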
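The equality check above covers the common case of a single ETag. The If-None-Match header may also carry a comma-separated list of ETags or the wildcard *, and it is evaluated with weak comparison, which ignores a W/ prefix. A sketch of that matching logic in plain Python, assuming a hypothetical helper named etag_matches (it is not an aiohttp function):

```python
import re

# Matches one entity tag: optional W/ prefix, then a quoted opaque value
_ETAG_RE = re.compile(r'(W/)?"([^"]*)"')


def etag_matches(if_none_match, current_etag):
    """Return True if the If-None-Match value matches current_etag.

    etag_matches is a hypothetical helper; it uses weak comparison,
    so only the quoted opaque values are compared.
    """
    if if_none_match is None:
        return False
    if if_none_match.strip() == "*":
        # The wildcard matches any existing representation
        return True
    current = _ETAG_RE.fullmatch(current_etag.strip())
    if current is None:
        return False
    # Compare the opaque value against every tag listed by the client
    return any(
        candidate.group(2) == current.group(2)
        for candidate in _ETAG_RE.finditer(if_none_match)
    )


print(etag_matches('"xyz", W/"abc"', '"abc"'))
print(etag_matches('"xyz"', '"abc"'))
```

In a handler, the result of such a check would decide between returning the 304 response and returning the full body.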
ETag-aware Caching in aiohttp Clients
To implement ETag-aware caching in aiohttp clients, you can use a dictionary to store ETags:
    import aiohttp
    import asyncio

    # Maps each URL to the ETag from its most recent response
    etag_cache = {}

    async def fetch_with_etag(url):
        async with aiohttp.ClientSession() as session:
            # Send the stored ETag, if any, so the server can answer 304
            headers = {'If-None-Match': etag_cache[url]} if url in etag_cache else {}
            async with session.get(url, headers=headers) as response:
                if response.status == 304:
                    print(f"Resource not modified: {url}")
                    return None
                # Remember the ETag only when the server actually sent one
                etag = response.headers.get('ETag')
                if etag is not None:
                    etag_cache[url] = etag
                content = await response.text()
                print(f"Received content: {content}")
                return content

    async def main():
        # First request
        await fetch_with_etag('http://127.0.0.1:8080')
        # Second request (should return 304 Not Modified)
        await fetch_with_etag('http://127.0.0.1:8080')

    asyncio.run(main())
Output:
    Received content: Hello, World!
    Resource not modified: http://127.0.0.1:8080
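This function stores ETags in a dictionary keyed by URL and sends them in the If-None-Match header of subsequent requests. On a 304 response it returns None, because only the ETag is cached, not the content.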
Update Local Cache Based on ETag Changes
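To update the local cache based on ETag changes, store the response data alongside its ETag and replace both whenever the server returns a new version: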
    import json
    import asyncio
    import aiohttp

    async def fetch_and_update_cache(url, cache):
        async with aiohttp.ClientSession() as session:
            # Send the cached ETag, if there is one, to make the request conditional
            etag = cache.get(url, {}).get('etag')
            headers = {'If-None-Match': etag} if etag else {}
            async with session.get(url, headers=headers) as response:
                if response.status == 304:
                    print("Cache is up to date")
                    return cache[url]['data']
                # New or changed resource: store the body together with its ETag
                data = await response.json()
                cache[url] = {'etag': response.headers.get('ETag'), 'data': data}
                print("Cache updated")
                return data

    cache = {}
    result = asyncio.run(fetch_and_update_cache('https://api.github.com/users/octocat', cache))
    print(json.dumps(result, indent=2))
Output:
    Cache updated
    {
      "login": "octocat",
      "id": 583231,
      "node_id": "MDQ6VXNlcjU4MzIzMQ==",
      "avatar_url": "https://avatars.githubusercontent.com/u/583231?v=4",
      "gravatar_id": "",
      "url": "https://api.github.com/users/octocat",
      "html_url": "https://github.com/octocat",
      "followers_url": "https://api.github.com/users/octocat/followers",
      "following_url": "https://api.github.com/users/octocat/following{/other_user}",
      "gists_url": "https://api.github.com/users/octocat/gists{/gist_id}",
      "starred_url": "https://api.github.com/users/octocat/starred{/owner}{/repo}",
      "subscriptions_url": "https://api.github.com/users/octocat/subscriptions",
      "organizations_url": "https://api.github.com/users/octocat/orgs",
      "repos_url": "https://api.github.com/users/octocat/repos",
      "events_url": "https://api.github.com/users/octocat/events{/privacy}",
      "received_events_url": "https://api.github.com/users/octocat/received_events",
      "type": "User",
      "site_admin": false,
      "name": "The Octocat",
      "company": "@github",
      "blog": "https://github.blog",
      "location": "San Francisco",
      "email": null,
      "hireable": null,
      "bio": null,
      "twitter_username": null,
      "public_repos": 8,
      "public_gists": 8,
      "followers": 14882,
      "following": 9,
      "created_at": "2011-01-25T18:44:36Z",
      "updated_at": "2024-08-22T11:25:04Z"
    }
This function updates the cache with new data and ETags when the resource changes.
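The cache-update flow above can be traced without a network connection. The sketch below swaps the HTTP call for a hypothetical in-process stand-in, fake_server, and mirrors the same decision: reuse the cached data on 304, otherwise replace the cached ETag and data. All names and the URL are assumptions made for illustration.

```python
import hashlib


def fake_server(resource, if_none_match=None):
    """Hypothetical stand-in for an HTTP server: returns (status, headers, body)."""
    etag = '"' + hashlib.md5(resource.encode()).hexdigest() + '"'
    if if_none_match == etag:
        # The client's copy is current, so no body is sent
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, resource


def fetch_cached(url, resource, cache):
    """Apply the cache-update logic, using fake_server instead of HTTP."""
    entry = cache.get(url)
    status, headers, body = fake_server(resource, entry["etag"] if entry else None)
    if status == 304:
        return entry["data"], "hit"
    # New or changed resource: replace both the ETag and the data
    cache[url] = {"etag": headers["ETag"], "data": body}
    return body, "updated"


cache = {}
url = "http://example.invalid/doc"
first = fetch_cached(url, "version 1", cache)
second = fetch_cached(url, "version 1", cache)
third = fetch_cached(url, "version 2", cache)
print(first)
print(second)
print(third)
```

The second call is served from the cache because the ETag still matches; the third call replaces the entry because the content, and therefore the ETag, has changed.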
Optimize ETags for Static Files
For static files such as font and CSS files, you can use file metadata to generate ETags:
    import os

    def generate_static_file_etag(file_path):
        # Combine modification time and size; a change to either yields a new ETag
        stat = os.stat(file_path)
        return f"{stat.st_mtime}-{stat.st_size}"

    file_path = "style.css"
    etag = generate_static_file_etag(file_path)
    print(f"ETag for {file_path}: {etag}")
Output:
ETag for style.css: 1726048509.5163145-250
This function generates an ETag based on the file’s modification time and size, suitable for static files.
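The value above is unquoted, and st_mtime is a float whose precision depends on the platform and filesystem. One possible variation uses the integer st_mtime_ns, hex-encodes both fields, and adds the quotes the ETag header expects. The function name static_file_etag and the temporary file are assumptions for illustration:

```python
import os
import tempfile


def static_file_etag(file_path):
    """Build a quoted ETag from nanosecond mtime and size (hypothetical helper)."""
    stat = os.stat(file_path)
    return f'"{stat.st_mtime_ns:x}-{stat.st_size:x}"'


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "style.css")
    with open(path, "w") as f:
        f.write("body { margin: 0; }")
    etag_before = static_file_etag(path)
    etag_same = static_file_etag(path)

    # Appending changes the file size, so the ETag must change as well
    with open(path, "a") as f:
        f.write("\np { color: red; }")
    etag_after = static_file_etag(path)

print(etag_before == etag_same)
print(etag_before != etag_after)
```

Because the ETag comes from metadata only, the file never has to be read, which keeps the cost constant even for large assets.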
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.