Copyright © All Rights Reserved. World of Software.

Lessons on How to Get Timeouts, Retries and Idempotency Right From Sam Newman at QCon London

News Room
Last updated: 2025/04/09 at 1:07 PM
News Room Published 9 April 2025

At QCon London, Sam Newman, the architect to whom the coining of the term microservices has been attributed, went back to basics to underline the three critical things to get right when working with distributed systems: timeouts, retries and idempotency. Throughout the talk, he presented mechanisms that make distributed systems more robust.

He started his presentation by “poking at” the quote “Insanity is doing the same thing over and over again and expecting different results,” stating that in many situations, especially in distributed systems, doing the same thing again is advisable. Further, he underlined that developers don’t need to run complex analyses of Paxos vs Raft vs SWIM, or even debate the nuances of the CAP theorem. They just need to wrap their heads around timeouts (“knowing when to give up”), retries (“how many times should I try again”) and idempotency (making it “a bit” safe to try again).

Leslie Lamport: “A distributed system is one in which the failure of a computer you didn’t even know existed can render your computer unusable.”

To further frame the context, he enumerated the three “golden rules” of distributed systems:

  1. You can’t beam information between two points instantaneously
  2. Sometimes, you can’t reach the thing you want to talk to
  3. Resources are finite

Before going into detail on how to make distributed systems more robust, he stressed that these three rules, taken together, underpin all the complexity that distributed systems hide behind different abstractions.

Timeouts: A threshold after which a request will be terminated if not completed

The system uses computational resources (CPUs, threads, or memory) while waiting, regardless of whether the IO is blocking or non-blocking. Waiting “a lot” means requests pile up until the system is overwhelmed, which translates into “stuff falling over.” The user experience may also degrade: how long will the customer wait for the action to finish?

It’s challenging to get the timeout right. To avoid timing out too quickly or waiting too long, you mainly need to understand two things: how long operations usually take in your system, and what users expect from it (“When are they starting to hit the refresh button of the page”). Besides finding the proper timeout value, it is essential to make the system’s behaviour more consistent: allocating enough resources helps keep call durations within a narrower range. The system should also allow timeout values to be changed without recompiling or redeploying.
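To make the two inputs above concrete, here is a minimal Python sketch, not something shown in the talk. It assumes a hypothetical setting name (DOWNSTREAM_TIMEOUT_S), a hypothetical default, and an illustrative heuristic: take a high percentile of observed latencies and cap it at what users will tolerate.

```python
import os
import statistics

DEFAULT_TIMEOUT_S = 2.0  # hypothetical fallback, not a recommended value


def current_timeout_s(env=os.environ):
    """Read the timeout on every call, so it can change without a redeploy."""
    raw = env.get("DOWNSTREAM_TIMEOUT_S")  # hypothetical setting name
    if raw is None:
        return DEFAULT_TIMEOUT_S
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_TIMEOUT_S  # ignore unparsable values
    return value if value > 0 else DEFAULT_TIMEOUT_S


def suggest_timeout_s(latencies_s, user_patience_s, percentile=99):
    """Illustrative heuristic: a high latency percentile, capped by user patience."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(n=100) returns the 99 cut points between percentiles.
    cut_points = statistics.quantiles(latencies_s, n=100, method="inclusive")
    observed = cut_points[percentile - 1]
    return min(observed, user_patience_s)
```

Reading the value at call time, rather than once at startup, is what lets an operator adjust it while the service is running. The percentile and the cap are starting points to be tuned against real measurements.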

Newman: “Timeouts are about prioritising system health over the success of a single request”.

Retries

Like timeouts, choosing the proper number of retries is challenging. Too many retries amount to a self-inflicted DoS attack. To make systems more resilient, you should implement rate limiting on the client and server-side mechanisms to shed excess load. Introducing artificial jitter (random-valued delays between retries) also gives your systems time to recover from failures. Newman warns against introducing exponential backoff, arguing that it puts more pressure on your system rather than relieving it.
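A bounded retry loop with random delays could look like the Python sketch below. The function name, parameters and default values are assumptions made for illustration. In line with the warning reported above, the delay is a flat random value rather than an exponentially growing one.

```python
import random
import time


def retry_with_jitter(operation, max_attempts=3, max_delay_s=0.5,
                      retry_on=(TimeoutError, ConnectionError),
                      sleep=time.sleep):
    """Call operation() up to max_attempts times, pausing randomly in between."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < max_attempts:
                # Jitter: a random pause so clients do not retry in lockstep.
                sleep(random.uniform(0.0, max_delay_s))
    # Give up after the last attempt and surface the final failure.
    raise last_error
```

The hard cap on attempts is what keeps retries from turning into the self-inflicted overload described above. The sleep function is injectable only so the loop can be tested without real waiting.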

 

Idempotency: the property of an operation that can be applied multiple times without changing the result beyond the first application.

The last fundamental pillar of distributed systems is ensuring that it’s safe to retry calls. If the first two pillars focus on what the clients need to do to make the systems safer, the last one is all about behaviour on the server side. According to Newman, there are two possibilities if a client doesn’t receive a response from a server:

  • The request didn’t go through. Hence, the server didn’t have anything to process. In this case, there is no problem.
  • The request was processed, but the response didn’t reach the client. The system already applied the change, but the client wasn’t notified.

Idempotency is easy to implement upfront but harder to retrofit. He mentioned two ways of implementing it. The first is a request ID, an approach that multiple major cloud providers use, although it also requires changes on the client side.
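The request ID approach can be sketched in a few lines of Python. The PaymentService class, its in-memory dictionary and the "replayed" flag are hypothetical names chosen for illustration. A real service would need durable storage and would have to handle concurrent requests carrying the same ID.

```python
import uuid


class PaymentService:
    """Toy server that remembers the response for each request ID."""

    def __init__(self):
        self._responses = {}       # request_id -> response already returned
        self.charges_applied = 0   # counts real side effects

    def charge(self, request_id, amount):
        if request_id in self._responses:
            # Seen before: replay the stored answer, do not charge again.
            return {**self._responses[request_id], "replayed": True}
        self.charges_applied += 1  # the actual side effect happens once
        response = {"status": "charged", "amount": amount, "replayed": False}
        self._responses[request_id] = response
        return response


# Client side: generate the ID once and reuse it on every retry.
service = PaymentService()
request_id = str(uuid.uuid4())
first = service.charge(request_id, 25)
retry = service.charge(request_id, 25)  # e.g. the first response was lost
```

The client-side change is that the ID must be created before the first attempt and kept across retries. Generating a fresh ID per attempt would defeat the check.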

The alternative, fingerprinting the request, keeps the changes isolated on the server side. The fingerprint must be based on information that stays consistent between requests (avoid timestamps, which should be part of the header in the first place), and it must also be time-bound. Another consideration is that you should tell the client when a request had already been processed; a good place for that information is the response metadata.
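A server-side fingerprint check might be sketched as follows in Python. The field names, the SHA-256 hash over canonical JSON, the 60-second default window and the "duplicate" metadata key are all assumptions for illustration, not details from the talk.

```python
import hashlib
import json
import time

VOLATILE_FIELDS = {"timestamp"}  # hypothetical fields that differ per attempt


def fingerprint(client_id, body):
    """Hash only the parts of the request that are stable across retries."""
    stable = {k: v for k, v in body.items() if k not in VOLATILE_FIELDS}
    canonical = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(f"{client_id}|{canonical}".encode("utf-8")).hexdigest()


class FingerprintStore:
    """Remembers recent fingerprints for a limited time (time-bound)."""

    def __init__(self, ttl_s=60.0, clock=time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._seen = {}  # fingerprint -> (expires_at, result)

    def handle(self, client_id, body, process):
        now = self._clock()
        key = fingerprint(client_id, body)
        entry = self._seen.get(key)
        if entry is not None and entry[0] > now:
            # Duplicate within the window: return the earlier result and
            # say so in the metadata instead of processing again.
            return {"result": entry[1], "metadata": {"duplicate": True}}
        result = process(body)
        self._seen[key] = (now + self._ttl_s, result)
        return {"result": result, "metadata": {"duplicate": False}}
```

The time window is a trade-off: too short and slow retries slip through as new requests, too long and a genuinely repeated request is swallowed. This sketch never evicts expired entries; a real store would.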

 

When the request body might change between attempts, it is better to implement both mechanisms.

Newman closed the presentation by stating that in distributed systems, doing the same thing repeatedly is eminently sensible, but only up to a point and only when you can make those retries safe. He also humorously pointed out that the opening quote is falsely attributed to Albert Einstein.
