Airbnb has upgraded the traffic management architecture of its multi-tenant key-value store, Mussel, replacing static per-client rate limits with a fully adaptive, resource-aware system. The redesign aims to maintain service quality during traffic spikes, protect critical workflows, and ensure fair usage across thousands of tenants.
Mussel originally relied on a Redis-backed counter enforcing fixed queries-per-second (QPS) limits per client. While effective for basic isolation, this approach did not account for the true cost of requests or adapt quickly to shifting workloads. High-variance traffic, such as large data uploads, promotion-driven bursts, and spikes from automated processes, exposed the limitations of static caps.
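The original scheme amounts to a fixed-window counter. The sketch below illustrates that pattern, assuming a redis-py client and a hypothetical key layout; the post does not show Mussel's actual counter code.

```python
import time
import redis  # assumes the redis-py client; Mussel's real implementation is not public

r = redis.Redis()

def allow_request(client_id: str, qps_limit: int) -> bool:
    """Fixed-window counter: one Redis key per client per one-second window."""
    window = int(time.time())                  # current one-second window
    key = f"ratelimit:{client_id}:{window}"    # hypothetical key layout
    count = r.incr(key)                        # atomic increment for this window
    if count == 1:
        r.expire(key, 2)                       # let stale windows expire
    return count <= qps_limit                  # reject once the static cap is reached
```

Because the cap counts requests rather than their cost, a single bulk upload and a cheap point read consume the same budget, which is the gap the redesign addresses.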
Progression of Traffic Management Strategies (Source: Airbnb Tech Blog)
Shravan Gaonkar, Casey Getz, and Wonhee Cho, engineers at Airbnb, explain the importance of traffic management on Mussel:
Mussel handles millions of predictable reads, but during peak events, it must absorb higher volumes, terabyte-scale uploads, and sudden bursts from bots or DDoS attacks. Serving this volatile traffic reliably is critical to both Airbnb’s user experience and the stability of the services powering the platform.
The first component of the redesign is a resource-aware rate controller that measures requests in request units (RU), accounting for rows processed, bytes read or written, and latency. Each dispatcher maintains local token buckets that refill periodically, and each request deducts its RU cost from the bucket. When tokens are exhausted, the dispatcher returns HTTP 429 responses. Because enforcement is local, no cross-node coordination is needed, and dispatchers make fast, independent decisions within milliseconds.
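A minimal sketch of such an RU-based token bucket is shown below. The cost weights, capacity, and refill rate are illustrative assumptions, not Mussel's published values.

```python
import time
from dataclasses import dataclass, field

@dataclass
class TokenBucket:
    """Per-client bucket measured in request units (RU), refilled locally at each dispatcher."""
    capacity_ru: float            # burst allowance in RU
    refill_ru_per_sec: float      # steady-state RU budget
    tokens: float = field(init=False)
    last_refill: float = field(init=False)

    def __post_init__(self):
        self.tokens = self.capacity_ru
        self.last_refill = time.monotonic()

    def try_consume(self, cost_ru: float) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at the bucket's capacity.
        self.tokens = min(self.capacity_ru,
                          self.tokens + (now - self.last_refill) * self.refill_ru_per_sec)
        self.last_refill = now
        if self.tokens >= cost_ru:
            self.tokens -= cost_ru
            return True
        return False              # caller maps this to an HTTP 429 response

def request_cost_ru(rows: int, bytes_io: int, latency_ms: float) -> float:
    """Illustrative cost model only; the real RU weights are not published."""
    return 1.0 + rows * 0.01 + bytes_io / 4096 + latency_ms * 0.05
```

Keeping the bucket in dispatcher memory is what allows the decision to complete in microseconds to milliseconds without consulting a shared store.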
Latency response over time and illustration of throttling (Source: Airbnb Tech Blog)
Load shedding was added to stabilize the system during sudden traffic spikes. Dispatchers compute a latency ratio using the P² quantile estimation algorithm, comparing short-term and long-term p95 latency without storing full samples. Combined with internal queue pressure signals and client traffic classification, dispatchers may delay or drop requests to protect overall throughput when backend resources approach saturation.
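The shedding decision can be sketched as below. For brevity, a stochastic-approximation quantile estimator stands in for the P² algorithm (both track a quantile without storing full samples), and the thresholds, step sizes, and signals are illustrative assumptions rather than Mussel's actual tuning.

```python
class OnlineP95:
    """Lightweight online p95 estimate via stochastic approximation.
    A simple stand-in for the P-squared estimator described in the post."""
    def __init__(self, step: float, quantile: float = 0.95):
        self.q = 0.0          # current quantile estimate in ms
        self.step = step      # smaller step -> slower-moving, "long-term" estimate
        self.p = quantile

    def update(self, latency_ms: float) -> None:
        # Nudge the estimate up on high samples, down on low ones, weighted by the quantile.
        if latency_ms > self.q:
            self.q += self.step * self.p
        else:
            self.q -= self.step * (1 - self.p)

short_term = OnlineP95(step=5.0)   # reacts within seconds
long_term = OnlineP95(step=0.2)    # reflects the steady-state baseline

def should_shed(latency_ms: float, queue_depth: int, is_batch_client: bool) -> bool:
    short_term.update(latency_ms)
    long_term.update(latency_ms)
    ratio = short_term.q / max(long_term.q, 1e-6)   # > 1 means latency is degrading
    # Illustrative policy: shed lower-priority batch traffic first as pressure builds.
    if ratio > 2.0 or queue_depth > 1000:
        return True
    if ratio > 1.5 and is_batch_client:
        return True
    return False
```

The latency ratio is what makes the signal self-normalizing: a shard that normally serves at 5 ms and one that serves at 50 ms both trigger shedding when their short-term p95 drifts well above their own baseline.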
Airbnb also introduced a hot-key defense layer. Using in-memory top-k detection and local LRU caches, dispatchers identify keys receiving disproportionate traffic. When a key becomes hot, the dispatcher serves cached results and coalesces concurrent requests, reducing load on backend shards.
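A simplified sketch of this dispatcher-side path is shown below, assuming asyncio request handling. The plain counter stands in for a real top-k sketch, and the hot-key threshold, cache size, and `fetch_from_shard` callback are hypothetical.

```python
import asyncio
from collections import Counter, OrderedDict

HOT_THRESHOLD = 1000      # hypothetical: hits per detection window that mark a key "hot"
CACHE_CAPACITY = 10_000   # hypothetical LRU cache size per dispatcher

key_counts = Counter()    # plain counter standing in for a top-k / heavy-hitters sketch
lru_cache = OrderedDict() # key -> value, most recently used at the end
in_flight = {}            # key -> asyncio.Future for fetches already on the wire

def record_hit(key: str) -> bool:
    """Count accesses and report whether the key is currently hot."""
    key_counts[key] += 1
    return key_counts[key] >= HOT_THRESHOLD

def cache_put(key: str, value: bytes) -> None:
    lru_cache[key] = value
    lru_cache.move_to_end(key)
    if len(lru_cache) > CACHE_CAPACITY:
        lru_cache.popitem(last=False)          # evict the least recently used entry

async def get(key: str, fetch_from_shard) -> bytes:
    hot = record_hit(key)
    if hot and key in lru_cache:
        lru_cache.move_to_end(key)
        return lru_cache[key]                  # hot key: serve from dispatcher memory
    if hot and key in in_flight:
        return await in_flight[key]            # hot key: coalesce onto the in-flight fetch
    future = asyncio.get_running_loop().create_future()
    in_flight[key] = future
    try:
        value = await fetch_from_shard(key)    # only one request reaches the backend shard
        cache_put(key, value)
        future.set_result(value)
        return value
    except Exception as exc:
        future.set_exception(exc)
        raise
    finally:
        del in_flight[key]
```

In this arrangement, concurrent readers of a hot key either hit the local cache or await the single outstanding shard fetch, so the shard sees roughly one request per cache refresh rather than the full burst.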
Hot keys detected and served from dispatcher cache in real time (Source: Airbnb Tech Blog)
Airbnb reported that during a controlled DDoS drill targeting a small set of keys at roughly one million QPS, the hot-key layer reduced the burst to a trickle. Each dispatcher forwarded only occasional requests, keeping the load well below the capacity of individual shards and preventing backend overload.
The engineering team emphasizes that keeping control loops local has been key to Mussel’s evolution. Resource-aware rate control, load shedding, and hot-key management operate entirely within each dispatcher, eliminating cross-node coordination. This design allows fast, independent decisions, linear scaling under stress, and reliable protection even during peak events, illustrating how layered, cost-aware QoS enables both resilience and fairness in handling volatile traffic.
Airbnb plans further improvements, including automatic quota tuning from historical usage and tighter database integration via resource groups, aiming to enhance both throughput and service isolation in Mussel.
