Files
system-design-101/data/guides/how-do-we-retry-on-failures.md
Kamran Ahmed ee4b7305a2 Adds ByteByteGo guides and links (#106)
This PR adds all the guides from [Visual
Guides](https://bytebytego.com/guides/) section on bytebytego to the
repository with proper links.

- [x] Markdown files for guides and categories are placed inside
`data/guides` and `data/categories`
- [x] Guide links in readme are auto-generated using
`scripts/readme.ts`. Everytime you run the script `npm run
update-readme`, it reads the categories and guides from the above
mentioned folders, generate production links for guides and categories
and populate the table of content in the readme. This ensures that any
future guides and categories will automatically get added to the readme.
- [x] Sorting inside the readme matches the actual category and guides
sorting on production
2025-03-31 22:16:44 -07:00

2.8 KiB

title, description, image, createdAt, draft, categories, tags
title description image createdAt draft categories tags
Retry Strategies for System Failures Explore retry strategies for handling transient errors in distributed systems. https://assets.bytebytego.com/diagrams/0229-how-do-we-retry-on-failures.png 2024-02-23 false
cloud-distributed-systems
distributed systems
retry strategies

In distributed systems and networked applications, retry strategies are crucial for handling transient errors and network instability effectively. The diagram shows 4 common retry strategies.

Linear Backoff

Linear backoff involves waiting for a progressively increasing fixed interval between retry attempts.

  • Advantages: Simple to implement and understand.

  • Disadvantages: May not be ideal under high load or in high-concurrency environments as it could lead to resource contention or "retry storms".

Linear Jitter Backoff

Linear jitter backoff modifies the linear backoff strategy by introducing randomness to the retry intervals. This strategy still increases the delay linearly but adds a random "jitter" to each interval.

  • Advantages: The randomness helps spread out the retry attempts over time, reducing the chance of synchronized retries across instances.

  • Disadvantages: Although better than simple linear backoff, this strategy might still lead to potential issues with synchronized retries as the base interval increases only linearly.

Exponential Backoff

Exponential backoff involves increasing the delay between retries exponentially. The interval might start at 1 second, then increase to 2 seconds, 4 seconds, 8 seconds, and so on, typically up to a maximum delay. This approach is more aggressive in spacing out retries than linear backoff.

  • Advantages: Significantly reduces the load on the system and the likelihood of collision or overlap in retry attempts, making it suitable for high-load environments.

  • Disadvantages: In situations where a quick retry might resolve the issue, this approach can unnecessarily delay the resolution.

Exponential Jitter Backoff

Exponential jitter backoff combines exponential backoff with randomness. After each retry, the backoff interval is exponentially increased, and then a random jitter is applied. The jitter can be either additive (adding a random amount to the exponential delay) or multiplicative (multiplying the exponential delay by a random factor).

  • Advantages: Offers all the benefits of exponential backoff, with the added advantage of reducing retry collisions even further due to the introduction of jitter.

  • Disadvantages: The randomness can sometimes result in longer than necessary delays, especially if the jitter is significant.