Uber Moves from Static Limits to Priority-Aware Load Control for Distributed Storage

By News Room · Published 29 January 2026, last updated 11:40 AM

Uber engineers have described how they evolved their distributed storage platform from static rate limiting to a priority-aware load management system to protect their in-house databases. The change addressed the limitations of QPS-based rate limiting in large, stateful, multi-tenant systems, which did not reflect actual load, handle noisy neighbors, or protect tail latency.

The design protects Docstore and Schemaless, built on MySQL® and serving traffic through thousands of microservices supporting over 170 million monthly active users, including riders, Uber Eats users, drivers, and couriers. By prioritizing critical traffic and adapting dynamically to system conditions, the system prevents cascading overloads and maintains performance at scale.

Uber engineers noted that early quota-based approaches relied on static limits enforced through centralized tracking but proved ineffective. Stateless routing layers lacked timely visibility into partition-level load, and requests of similar size imposed varying CPU, memory, or I/O costs. Operators frequently retuned limits, sometimes shedding healthy traffic while overloaded partitions remained unprotected. 
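To see why per-request quotas fall short, consider a minimal token-bucket sketch (an illustration, not Uber's actual implementation): it charges every request exactly one token, so a cheap point read and a heavy range scan look identical to the limiter.

```python
import time

class TokenBucket:
    """Static QPS limiter: every request costs exactly one token,
    regardless of its real CPU, memory, or I/O cost."""
    def __init__(self, rate, burst):
        self.rate = rate                  # tokens refilled per second
        self.burst = burst                # bucket capacity
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self):
        # Refill proportionally to elapsed time, capped at burst.
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# A point read and a partition-crushing scan are charged identically,
# so the bucket cannot distinguish light load from heavy load, and a
# limit tuned for one workload mix misbehaves on another.
```

This is exactly the retuning treadmill the engineers describe: the operator's only knob is the static rate, which has no relationship to the cost of what it admits.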

As Dhyanam V., an Uber engineer, noted in a LinkedIn post:

Overload protection in stateful databases is a multi-dimensional problem at scale.

To address this, Uber colocated load management with the stateful storage nodes, combining Controlled Delay (CoDel) queuing with a per-tenant Scorecard. CoDel adjusted queue behavior based on latency, Scorecard enforced concurrency limits, and additional regulators monitored I/O, memory, goroutines, and hotspots. However, CoDel treated all requests equally, dropping low-priority and user-facing traffic alike, which increased on-call load and hurt user experience. It also relied on fixed queue timeouts and static in-flight limits, which could trigger thundering-herd retries and drop high-priority requests. While the system prevented catastrophic failures, it lacked the dynamism and nuance required for consistent performance, highlighting the need for priority-aware queues.

Load manager setup with CoDel queue (Source: Uber Blog Post)
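The setup above can be sketched as a FIFO with a fixed queue timeout and a static in-flight limit. This is a deliberately simplified illustration (real CoDel tracks minimum sojourn time over an interval before entering a dropping state; the timeout and limit values here are assumed), but it shows the failure mode the article calls out: every request is shed with equal probability.

```python
import collections
import time

QUEUE_TIMEOUT = 0.005    # fixed queue timeout (assumed value)
INFLIGHT_LIMIT = 4       # static in-flight limit (assumed value)

class CoDelQueue:
    """Simplified CoDel-style admission: requests wait in a FIFO and
    are dropped once they have waited past a fixed timeout. Note the
    weakness: low-priority and user-facing requests are shed alike."""
    def __init__(self):
        self.queue = collections.deque()
        self.inflight = 0

    def enqueue(self, req):
        self.queue.append((req, time.monotonic()))

    def next(self):
        """Dequeue the next admissible request, shedding stale ones."""
        while self.queue and self.inflight < INFLIGHT_LIMIT:
            req, enqueued_at = self.queue.popleft()
            if time.monotonic() - enqueued_at > QUEUE_TIMEOUT:
                continue                  # waited too long: shed it
            self.inflight += 1
            return req
        return None                       # empty, or at in-flight limit

    def done(self):
        self.inflight -= 1
```

Because the timeout and limit are constants, a burst of timed-out requests gets retried en masse, the thundering-herd behavior the engineers observed.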

The next evolution introduced Cinnamon. This priority-aware load shedder assigns requests to ranked tiers, allowing lower-priority traffic to be dropped before latency-sensitive operations are affected. Cinnamon dynamically tunes in-flight limits and queue timeouts using high-percentile latency metrics, reducing dependence on static thresholds and enabling smoother degradation during overload events.

Load manager setup with Cinnamon queue (Source: Uber Blog Post)
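A minimal sketch of the priority-tier idea: maintain a cutoff tier, reject everything below it, and move the cutoff based on observed high-percentile latency instead of a static threshold. The tier names, window size, and adjustment rule here are illustrative assumptions, not Cinnamon's actual algorithm.

```python
# Illustrative tiers, ranked lowest to highest priority.
TIERS = {"batch": 0, "background": 1, "user_facing": 2}

class PriorityShedder:
    """Priority-aware shedding sketch: requests below the current
    cutoff tier are rejected; the cutoff tracks observed p99."""
    def __init__(self, target_p99_ms):
        self.target = target_p99_ms
        self.cutoff = 0              # admit every tier initially
        self.latencies = []

    def observe(self, latency_ms):
        # Sliding window of the last 100 latency samples.
        self.latencies = (self.latencies + [latency_ms])[-100:]
        xs = sorted(self.latencies)
        p99 = xs[min(len(xs) - 1, int(0.99 * len(xs)))]
        if p99 > self.target:        # overloaded: shed one more tier
            self.cutoff = min(self.cutoff + 1, max(TIERS.values()))
        elif p99 < self.target / 2:  # healthy: readmit one tier
            self.cutoff = max(self.cutoff - 1, 0)

    def admit(self, tier):
        return TIERS[tier] >= self.cutoff
```

The key property is graceful degradation: under sustained overload the cutoff climbs, so batch traffic is shed first while user-facing requests keep flowing, and the cutoff relaxes again as latency recovers.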

Uber later unified local and distributed overload signals into a single modular control loop using a “Bring Your Own Signal” model. This architecture allows teams to plug in both node-level indicators, such as in-flight concurrency and memory pressure, and cluster-level signals, including follower commit lag, into a centralized admission control path. Consolidating these signals eliminated fragmented control logic and avoided the conflicting load-shedding decisions seen in earlier token-bucket-based systems.
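One way such a pluggable model can be structured (the interface, health-score convention, and threshold below are assumptions, not Uber's actual API) is to reduce every signal, node-level or cluster-level, to a common score feeding one admission decision:

```python
from typing import Callable

# "Bring Your Own Signal": each registered signal returns a health
# score in [0, 1]; 1.0 is fully healthy, 0.0 is fully overloaded.
Signal = Callable[[], float]

class AdmissionController:
    """Single admission path consuming arbitrary pluggable signals."""
    def __init__(self):
        self.signals = []

    def register(self, signal: Signal):
        self.signals.append(signal)

    def health(self):
        # The node is only as healthy as its worst signal,
        # whether that signal is local or cluster-wide.
        return min((s() for s in self.signals), default=1.0)

    def admit(self):
        return self.health() > 0.2   # assumed shed threshold

ctrl = AdmissionController()
ctrl.register(lambda: 0.7)   # e.g. node-level: in-flight concurrency
ctrl.register(lambda: 0.1)   # e.g. cluster-level: follower commit lag
```

Because every signal flows through one `health()` aggregation, two regulators can no longer make conflicting shedding decisions about the same request, which is the fragmentation the unified loop was built to remove.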

According to Uber, the results have been substantial. Under overload conditions, throughput increased by approximately 80 percent, while P99 latency for upsert operations dropped by around 70 percent. The system also reduced goroutine counts by roughly 93 percent and lowered peak heap usage by about 60 percent, improving overall efficiency and reducing operational toil.

Uber highlights key lessons from its load management evolution: prioritize critical user-facing traffic, shedding lower-priority requests first; reject requests early to maintain predictable latencies and reduce memory pressure; use PID-based regulation for stability; place control near the source of truth; adapt dynamically to workloads; maintain observability; and favor simplicity to ensure consistent, resilient operation under stress.
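The PID-based regulation lesson can be sketched as a classic control loop steering the in-flight limit toward a latency target. The gains, starting limit, and target below are illustrative assumptions, not values from the article.

```python
class PIDLimiter:
    """PID loop that adjusts the in-flight concurrency limit so that
    measured latency tracks a target, instead of a static cap."""
    def __init__(self, target_ms, kp=0.5, ki=0.1, kd=0.2):
        self.target = target_ms
        self.kp, self.ki, self.kd = kp, ki, kd   # illustrative gains
        self.integral = 0.0
        self.prev_error = 0.0
        self.limit = 100.0                        # starting limit

    def update(self, measured_ms):
        # Positive error = latency under target = room to admit more.
        error = self.target - measured_ms
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        self.limit += (self.kp * error
                       + self.ki * self.integral
                       + self.kd * derivative)
        self.limit = max(1.0, self.limit)         # never fully close
        return int(self.limit)
```

The appeal over a static threshold is stability: the proportional term reacts to the current overshoot, the integral term removes steady-state drift, and the derivative term damps oscillation, so the limit settles rather than flapping between shedding everything and admitting everything.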
