Rethinking HTTP API Rate Limiting: A Client-Side Approach

📅 2025-10-06
🤖 AI Summary
Problem: When multiple clients share an HTTP API rate limit, no client has a global view of the load, which leads to poorly timed retries, a surge in 429 (Too Many Requests) errors, and higher operational costs. Method: This paper proposes a decentralized, adaptive client-side rate-limiting mechanism that requires no centralized coordination. It introduces two distributed algorithms: ATB (Adaptive Throttling Backoff), an offline method, and AATB (Adaptive Aggregated Telemetry Backoff), which additionally uses real-time aggregated telemetry. Both infer system congestion from lightweight feedback and schedule retries accordingly. ATB is deployable via service workers; AATB augments it with telemetry. Results: In emulations with real-world traces and synthetic datasets with up to 100 clients, the approach reduces HTTP 429 errors by up to 97.3% compared with exponential backoff, at the cost of a modest increase in completion time.

📝 Abstract
HTTP underpins modern Internet services, and providers enforce quotas to regulate HTTP API traffic for scalability and reliability. When requests exceed a quota, clients are throttled and must retry. Server-side enforcement protects the service, but when independent clients' usage counts toward a shared quota, server-only controls are inefficient: clients lack visibility into each other's load, so their retries may fail again. Retry timing matters because each attempt incurs a cost and yields no benefit unless it is admitted. While centralized coordination could address this, practical limitations have led to widespread adoption of simple client-side strategies such as exponential backoff. As we show, these simple strategies cause excessive retries and significant costs. We design adaptive client-side mechanisms that require no central control and rely only on minimal feedback. We present two algorithms: ATB, an offline method deployable via service workers, and AATB, which enhances retry behavior using aggregated telemetry data. Both algorithms infer system congestion to schedule retries. Through emulations with real-world traces and synthetic datasets with up to 100 clients, we demonstrate that our algorithms reduce HTTP 429 errors by up to 97.3% compared to exponential backoff, while the modest increase in completion time is outweighed by the reduction in errors.
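To make the baseline concrete, the following is a minimal sketch of the exponential backoff strategy the abstract compares against, together with a toy model of several clients competing for one shared quota. It is not ATB or AATB, whose details are not given here. The fixed per-tick quota, the full-jitter variant, and all names and parameters (`backoff_delay`, `simulate_shared_quota`, `quota_per_tick`, `base`, `cap`) are assumptions chosen for illustration.

```python
import random
from collections import defaultdict


def backoff_delay(attempt, base=1.0, cap=64.0, rng=None):
    """Delay before retry number `attempt` (0-based).

    Capped exponential backoff. If `rng` is given, apply full jitter by
    drawing uniformly from [0, ceiling]; otherwise return the ceiling.
    """
    ceiling = min(cap, base * (2 ** attempt))
    if rng is None:
        return ceiling
    return rng.uniform(0, ceiling)


def simulate_shared_quota(num_clients=10, quota_per_tick=2,
                          base=1.0, cap=64.0, seed=0):
    """Toy model (an assumption, not the paper's setup).

    Every client has one request and sends it at tick 0. The server admits
    at most `quota_per_tick` requests per tick and answers the rest with
    429. A rejected client retries after a jittered backoff delay.
    Returns (number of 429 responses, ticks until all requests complete).
    """
    rng = random.Random(seed)
    schedule = defaultdict(list)  # tick -> [(client_id, attempt), ...]
    for client in range(num_clients):
        schedule[0].append((client, 0))

    rejected = 0
    completed = 0
    tick = 0
    while completed < num_clients:
        arrivals = schedule.pop(tick, [])
        rng.shuffle(arrivals)  # arrival order within a tick is arbitrary
        for position, (client, attempt) in enumerate(arrivals):
            if position < quota_per_tick:
                completed += 1  # admitted within the shared quota
            else:
                rejected += 1  # server would answer 429
                delay = max(1, round(backoff_delay(attempt, base, cap, rng)))
                schedule[tick + delay].append((client, attempt + 1))
        tick += 1
    return rejected, tick


if __name__ == "__main__":
    errors, ticks = simulate_shared_quota(num_clients=10, quota_per_tick=2)
    print(f"429 responses: {errors}, ticks to finish: {ticks}")
```

Without jitter, `backoff_delay(3)` returns `8.0` and `backoff_delay(10)` is clamped to the cap of `64.0`. In the toy model, each client chooses its delay without knowing how many other clients will retry in the same tick, which is the lack of visibility the abstract describes.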
Problem

Research questions and friction points this paper is trying to address.

Addresses inefficient HTTP API rate limiting with shared quotas
Reduces excessive client retries and associated costs
Proposes decentralized algorithms to infer system congestion
Innovation

Methods, ideas, or system contributions that make the work stand out.

Client-side adaptive mechanisms without central control
ATB algorithm deployable via service workers
AATB using telemetry data to infer congestion