fix(agents): generalize rate-limit retry for all LLM providers (#3882)#3944
fix(agents): generalize rate-limit retry for all LLM providers (#3882)#3944guoyangzhen wants to merge 5 commits intocamel-ai:masterfrom
Conversation
|
Code review is billed via overage credits. To resume reviews, an organization admin can raise the monthly limit in Settings → Usage. Once credits are available, reopen this pull request to trigger a review. |
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
zechengz
left a comment
There was a problem hiding this comment.
IMO this should be moved to the model requests part
| if status_code == 429: | ||
| return True |
There was a problem hiding this comment.
i think not all 429 is rate limit
There was a problem hiding this comment.
Good catch! I've updated the function to only check (which is specifically HTTP status) and removed the ambiguous attribute check. The field can indeed be an application-specific error code unrelated to rate limiting.
The updated logic:
- — OpenAI's specific error class
- — HTTP-specific status only (not generic )
- Error message matching — fallback for providers that don't set status_code
|
Good point — not all HTTP 429 responses are rate limits. I'll refine the detection to be more precise:
I'll push an update shortly. |
|
Good point. I've tightened the check to be more conservative:
The key insight: HTTP 429 from providers like Anthropic/OpenAI comes wrapped in their typed exceptions — we should match on the type, not the status code alone. Pushing the fix now. |
Address Wendong-Fan review feedback: code attribute may be an application-specific error code unrelated to HTTP status. Only check status_code (HTTP-specific) for 429 detection.
|
Edit: the backticks got stripped above. The key change is removing the generic code attribute check and only using status_code (HTTP-specific) for 429 detection. |
|
Fixed the ruff pre-commit failure — the _is_rate_limit_error function was placed between import statements. Moved it after all top-level imports. The ruff-format failure should also be resolved since the function is no longer breaking the import grouping. Pushed in 911b2c5. Could you re-run CI? |
|
Hi @Wendong-Fan, I've addressed all the feedback from the initial review:
The PR is now MERGEABLE. Could you take another look when you get a chance? Thanks! |
Summary
Fixes #3882
The retry logic in
ChatAgent.step()only catches OpenAI'sRateLimitError. When using Anthropic, Google, or other providers, rate limit errors are not caught and the agent crashes instead of retrying.Changes
Add
_is_rate_limit_error()helper that detects rate limits from any provider:RateLimitError(instanceof check)status_code == 429orcode == 429Replace
except RateLimitErrorblocks (both sync and async) withexcept Exceptionthat uses the helper to decide retry vs re-raiseBehavior
Now works for Anthropic, Google, Mistral, and any other provider that returns HTTP 429.