fix: don't retry on QueueingError (Veo/Lyria async queueing is not a transient error) #314
Open
chiuweilun1107 wants to merge 1 commit into HanaokaYuzu:master
…transient error)

When `generate_content` hits `Stream suspended (queueing=True)`, the server has accepted the request and is processing it asynchronously (e.g. Veo video rendering). The current code raises `APIError`, which the `@running(retry=5)` decorator treats as a transient error and retries, but each retry sends a **new request**, creating a new server-side job and burning an additional daily quota slot.

Empirically observed: within 45 seconds the decorator fired 4 retries, creating 4 independent Veo conversations visible in the web UI. Each consumed a separate daily quota slot.

Fix: introduce `QueueingError(GeminiError)` and raise it instead of `APIError` when `is_queueing=True`. Since the decorator only retries `APIError` (not `GeminiError`), queueing errors now bubble up immediately with zero retries. Callers can catch `QueueingError` and switch to a poll-based flow (`list_chats` + `read_chat`) to retrieve the result.

Non-queueing stream suspensions (`is_queueing=False`) still raise `APIError` and are retried as before, which preserves the existing recovery behaviour for transient connection issues.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Contributor
Gemini only actually receives the request and begins processing it when a CID appears, and when a CID is present, the system waits until a result is available. Therefore, the analysis in this PR is incorrect.
Problem
When `generate_content` encounters `Stream suspended (queueing=True)`, the server has accepted the request and is processing it asynchronously (e.g. Veo video rendering, Lyria music generation). The current code raises `APIError` at this point, which the `@running(retry=5)` decorator treats as a transient error and retries.

Each retry sends a new request, creating a new server-side job (new conversation) and burning an additional daily quota slot.
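The failure mode described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the library's code: `running` here is a simplified, synchronous stand-in for the real decorator (no backoff, no client re-initialisation), and `server_jobs` is a hypothetical list standing in for server-side jobs and quota slots.

```python
import functools


class GeminiError(Exception):
    """Stand-in for the library's base exception."""


class APIError(GeminiError):
    """Stand-in for the error type the decorator retries."""


def running(retry=0):
    """Simplified, hypothetical retry decorator: re-invokes on APIError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempts_left = retry
            while True:
                try:
                    return func(*args, **kwargs)
                except APIError:
                    if attempts_left == 0:
                        raise
                    attempts_left -= 1
        return wrapper
    return decorator


server_jobs = []  # each entry models one server-side job / quota slot


@running(retry=5)
def generate_content(prompt):
    # Assumption from the PR text: the request is accepted (job created)
    # before the stream is suspended.
    server_jobs.append(prompt)
    raise APIError("Stream suspended (queueing=True)")


try:
    generate_content("render a video")
except APIError:
    pass

print(len(server_jobs))  # 6: one initial attempt plus five retries
```

If the request really is accepted before the suspension, every retry in this model adds another job, which is the duplication the PR reports.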
Empirical evidence
Tested with Veo 3 video generation prompts: `Stream suspended` fires within ~12s of each attempt, faster than the first backoff.

Fix
Introduce `QueueingError(GeminiError)`, a new exception that inherits from `GeminiError` instead of `APIError`.

When `is_queueing=True` at the point of stream suspension:
- Raise `QueueingError` instead of `APIError`
- The `@running` decorator only retries `APIError`, so `QueueingError` bubbles up immediately with zero retries
- Callers can catch `QueueingError` and switch to a poll-based flow (`list_chats()` + `read_chat()`) to retrieve the result once the server finishes rendering

When `is_queueing=False` (transient connection issues, cookie drift, etc.):
- Raise `APIError` as before; existing retry behaviour is preserved

Changes
- `exceptions.py`: add `QueueingError(GeminiError)`
- `client.py`: import `QueueingError`; raise it instead of `APIError` when `is_queueing=True` in the stream-suspended branch

No changes to `decorators.py`: the fix works purely through the exception hierarchy.

Backward compatibility
- Code that catches `APIError` will not catch `QueueingError`; this is intentional (the retry was harmful, not helpful)
- Code that catches `GeminiError` will catch `QueueingError` (since it's a subclass)
- Code that catches `Exception` is unaffected

Usage example (after this fix)
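A minimal sketch of how a caller might handle `QueueingError`, assuming the poll-based flow named in the description. Everything here is illustrative: `FakeClient` is a hypothetical stub, the exception classes are local stand-ins, and the signatures and return values of `list_chats()` and `read_chat()` are assumptions rather than the library's actual API.

```python
import asyncio


class GeminiError(Exception):
    """Stand-in for the library's base exception."""


class APIError(GeminiError):
    """Stand-in for the retryable error type."""


class QueueingError(GeminiError):
    """Stand-in for the new exception: request accepted, result pending."""


class FakeClient:
    """Hypothetical stub; method names follow the PR text, signatures are assumed."""

    def __init__(self):
        self.polls = 0

    async def generate_content(self, prompt):
        # Simulates the stream being suspended with queueing=True.
        raise QueueingError("Stream suspended (queueing=True)")

    async def list_chats(self):
        return ["chat-1"]

    async def read_chat(self, chat_id):
        # Pretend the render finishes on the third poll.
        self.polls += 1
        return "video-ready" if self.polls >= 3 else None


async def generate_with_polling(client, prompt, interval=0.01, max_polls=10):
    try:
        return await client.generate_content(prompt)
    except QueueingError:
        # Do not resend the prompt; poll for the result of the queued job.
        for _ in range(max_polls):
            chats = await client.list_chats()
            result = await client.read_chat(chats[0])
            if result is not None:
                return result
            await asyncio.sleep(interval)
        raise TimeoutError("queued job did not finish in time")


result = asyncio.run(generate_with_polling(FakeClient(), "a cat surfing"))
print(result)  # video-ready
```

Because `QueueingError` is not an `APIError`, an `except APIError` handler would not intercept it, while an `except GeminiError` handler would. A real caller would also need some way to pick the right chat from `list_chats()`, which this stub sidesteps by returning a single entry.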
🤖 Generated with Claude Code