Finding
The core retry path in agent.ts handles 429 responses with local exponential backoff but does not consult the Retry-After header.
// Local exponential backoff — no Retry-After consulted
Math.min(1000 * 2 ** attempt, 10000)
Affected lines: 2039, 3267, 5884
isRetryableError() defaults to true, so all 429s enter the same retry path regardless of whether the signal is transient throttling (WAIT) or quota exhaustion (STOP).
The consequence
Under shared provider contention, independently-running agents built on VoltAgent can converge their retry windows instead of dispersing them.
Because this sits in the framework retry layer, every downstream agent inherits the behavior automatically. This stays invisible in light usage and surfaces during provider stress or concurrency spikes.
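To illustrate the convergence, here is a minimal sketch that evaluates the backoff expression quoted in the finding for two hypothetical agents. The helper names `backoffDelayMs` and `schedule` are illustrative and do not exist in VoltAgent, and starting `attempt` at 0 is an assumption. Because the expression has no jitter and no server-supplied input, agents throttled at the same moment compute the same delays.

```typescript
// Backoff expression quoted in the finding.
// Assumption: `attempt` starts at 0 for the first retry.
function backoffDelayMs(attempt: number): number {
  return Math.min(1000 * 2 ** attempt, 10000);
}

// Delay schedule (in ms) for the first `retries` attempts.
function schedule(retries: number): number[] {
  return Array.from({ length: retries }, (_, attempt) => backoffDelayMs(attempt));
}

// Two hypothetical agents that received a 429 at the same instant.
const agentA = schedule(5);
const agentB = schedule(5);

console.log(agentA); // [ 1000, 2000, 4000, 8000, 10000 ]
console.log(agentA.every((delay, i) => delay === agentB[i])); // true
```

Both agents retry at the same offsets, so their requests tend to arrive at the provider together again.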
Expected behavior
- Read Retry-After when present
- Use the header value as the minimum retry delay
- Distinguish quota exhaustion (STOP) from transient throttling (WAIT); do not retry quota exhaustion
- Fall back to exponential backoff only when the header is absent
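The rules above could be expressed as a single decision function, sketched below. `RetryDecision` and `decideOn429` are hypothetical names, not part of VoltAgent. How quota exhaustion is detected depends on the provider, so the sketch assumes the caller passes that in as a boolean along with an already-parsed Retry-After value in milliseconds.

```typescript
// Hypothetical result type: either stop retrying or wait for a given delay.
type RetryDecision =
  | { action: "stop" }
  | { action: "wait"; delayMs: number };

// Hypothetical helper, not part of VoltAgent's API.
function decideOn429(
  quotaExhausted: boolean, // assumption: provider-specific signal supplied by the caller
  retryAfterMs: number | null, // parsed Retry-After in ms, or null when the header is absent
  attempt: number,
): RetryDecision {
  // STOP: retrying cannot succeed until the quota is restored.
  if (quotaExhausted) {
    return { action: "stop" };
  }
  // WAIT: honor the server's delay when present, otherwise use local backoff.
  const backoffMs = Math.min(1000 * 2 ** attempt, 10000);
  const delayMs = retryAfterMs !== null ? retryAfterMs : backoffMs;
  return { action: "wait", delayMs };
}

console.log(decideOn429(true, 5000, 0)); // { action: 'stop' }
console.log(decideOn429(false, 30000, 0)); // { action: 'wait', delayMs: 30000 }
console.log(decideOn429(false, null, 2)); // { action: 'wait', delayMs: 4000 }
```

Keeping the decision separate from the sleep makes the STOP and WAIT paths easy to test without timers.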
Suggested fix shape
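if (response.status === 429) {
  // Prefer the server-provided delay when Retry-After is present.
  const retryAfter = response.headers?.get('retry-after');
  if (retryAfter) {
    // Only the delay-in-seconds form is handled here; Retry-After may
    // also be an HTTP-date, which parseInt will not interpret correctly.
    const parsed = parseInt(retryAfter, 10);
    if (!Number.isNaN(parsed) && parsed > 0) {
      await new Promise(resolve => setTimeout(resolve, parsed * 1000));
      continue;
    }
  }
  // Fall back to exponential backoff
  await new Promise(resolve =>
    setTimeout(resolve, Math.min(1000 * 2 ** attempt, 10000))
  );
}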
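The fix shape above reads Retry-After as a number of seconds. The header may also carry an HTTP-date, so a fuller parser might look like the following sketch. `parseRetryAfterMs` is a hypothetical helper, not an existing VoltAgent function, and the `nowMs` parameter is an assumption added so the date branch can be tested deterministically.

```typescript
// Hypothetical helper: converts a Retry-After header value to a delay in ms.
// Returns null when the value is neither whole seconds nor a parseable date.
function parseRetryAfterMs(value: string, nowMs: number = Date.now()): number | null {
  const trimmed = value.trim();

  // Form 1: non-negative whole number of seconds, e.g. "120".
  if (/^\d+$/.test(trimmed)) {
    return Number(trimmed) * 1000;
  }

  // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) {
    return null;
  }
  // A date in the past means the wait is already over.
  return Math.max(0, dateMs - nowMs);
}

const now = Date.UTC(2015, 9, 21, 7, 27, 30); // 30 seconds before the date below
console.log(parseRetryAfterMs("120")); // 120000
console.log(parseRetryAfterMs("Wed, 21 Oct 2015 07:28:00 GMT", now)); // 30000
console.log(parseRetryAfterMs("not-a-date")); // null
```

A null result would then route to the exponential backoff fallback rather than being treated as a zero delay.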
Related pattern
retry-after-ignored-under-concurrency
Corpus reference: https://github.com/SirBrenton/pitstop-truth