
fix: rewrite response model field back to request model#2645

Open
GinBan1 wants to merge 1 commit into farion1231:main from GinBan1:fix/response-model-rewrite

Conversation


@GinBan1 GinBan1 commented May 8, 2026

Summary

When the proxy maps models (e.g. claude-sonnet-4-6 → mimo-v2.5-pro), the upstream API response contains the mapped model name. This causes Claude Code's /context command to display the wrong model name, since it reads from local config rather than from the API response.

This PR rewrites the model field in the response back to the original request model, so the client always sees the model name it configured. The proxy's model mapping remains fully functional for routing — only the response metadata is adjusted.

Changes

response_processor.rs

  • Non-streaming passthrough: After decompression, parse JSON response, rewrite model field to ctx.request_model, re-serialize, and strip Content-Length header
  • Streaming passthrough: Intercept SSE message_start events and rewrite model field in the message object before yielding bytes to client
  • Updated create_logged_passthrough_stream signature to accept request_model: Option<String>
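The streaming rewrite can be sketched in std-only Rust. This is an illustrative assumption, not the PR's actual implementation: the real code in response_processor.rs parses the `data:` JSON payload, while this sketch splices the string directly (and does not handle `"model": "` with a space after the colon):

```rust
/// Rewrite the `model` value inside an SSE `message_start` event before it
/// is forwarded to the client. Other event types pass through untouched.
/// Sketch only: a real implementation should parse the `data:` JSON payload
/// rather than splice the raw string.
fn rewrite_message_start(event: &str, request_model: &str) -> String {
    if !event.contains("event: message_start") {
        return event.to_string();
    }
    let key = "\"model\":\"";
    if let Some(pos) = event.find(key) {
        let start = pos + key.len();
        if let Some(rel) = event[start..].find('"') {
            // Splice the configured request model over the upstream value.
            return format!("{}{}{}", &event[..start], request_model, &event[start + rel..]);
        }
    }
    event.to_string()
}
```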

handlers.rs

  • Transform mode: After OpenAI/Gemini → Anthropic conversion, rewrite model field before serialization
  • Updated stream creation call to pass request_model
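For the non-streaming paths, the key detail is that rewriting the body changes its length, which is why the stale Content-Length header must be stripped (or recomputed). A hedged std-only sketch of that contract, with illustrative names (the PR itself parses the body with serde_json):

```rust
/// Rewrite the `model` value in a JSON response body. Returns the
/// (possibly new) body and whether it changed; when it changed, the caller
/// must drop or recompute the Content-Length header, because the rewritten
/// body is usually a different length than the upstream one.
fn rewrite_response_model(body: &str, request_model: &str) -> (String, bool) {
    let key = "\"model\":\"";
    if let Some(pos) = body.find(key) {
        let start = pos + key.len();
        if let Some(rel) = body[start..].find('"') {
            if &body[start..start + rel] == request_model {
                return (body.to_string(), false); // already matches: no-op
            }
            let out = format!("{}{}{}", &body[..start], request_model, &body[start + rel..]);
            return (out, true);
        }
    }
    (body.to_string(), false)
}
```

The no-op branch mirrors the unmapped case in the test plan below: when the upstream already echoes the request model, nothing is reserialized and the original headers stay valid.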

Motivation

Users who switch between providers (e.g. from Anthropic to Xiaomi MiMo) via cc-switch want Claude Code's /context to display the correct model name. Currently, /context always shows the original Claude model name (e.g. "Sonnet 4.6") because Claude Code reads from its local config, not from the API response. With this fix, the response model field matches what Claude Code expects, so the display is always correct.

Test plan

  • Non-streaming request with model mapping: verify response model matches request model
  • Streaming request with model mapping: verify SSE message_start event model matches request model
  • Request without model mapping: verify no-op (model already matches)
  • Non-streaming transform mode: verify converted response model matches request model

When the proxy maps models (e.g. claude-sonnet-4-6 → mimo-v2.5-pro),
the upstream API response contains the mapped model name. This causes
Claude Code's /context to display the wrong model name, since it reads
from local config, not from the API response.

This fix rewrites the `model` field in the response back to the
original request model, so the client always sees the model name it
configured. Three code paths are covered:

- Non-streaming passthrough (response_processor.rs)
- Streaming passthrough via SSE message_start rewrite (response_processor.rs)
- Non-streaming transform mode (handlers.rs)

@chatgpt-codex-connector (bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fbf2295584


// Try to parse and log complete SSE events
let mut chunk = bytes.to_vec();
while let Some(event_text) = take_sse_block(&mut buffer) {
let event_start_in_buffer = buffer.len();


P1: Compute SSE rewrite offset from emitted chunk, not buffer tail

The rewrite index is derived from buffer.len() after take_sse_block drains the event, so it points to the remaining tail rather than the event's position in the current chunk. When a network chunk contains a full message_start event plus trailing bytes (for example, the start of the next SSE event), line_start_in_buffer is shifted forward and chunk.splice can replace the wrong byte range, producing malformed SSE/JSON output for clients. This breaks streaming responses intermittently depending on chunk boundaries.
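One way to sidestep the fragile offset arithmetic entirely is to rebuild the outgoing chunk from the events actually drained this round, leaving any partial trailing event in the buffer. A std-only sketch: `take_sse_block` here is a simplified stand-in for the project's helper, and the rewrite itself is illustrative, not the repo's code:

```rust
/// Simplified stand-in for the project's helper: drain one complete SSE
/// block (terminated by a blank line) from the front of the buffer.
fn take_sse_block(buffer: &mut String) -> Option<String> {
    let end = buffer.find("\n\n")?;
    Some(buffer.drain(..end + 2).collect())
}

/// Illustrative rewrite of the `model` value in a message_start event.
fn rewrite_event(event: &str, request_model: &str) -> String {
    if !event.contains("message_start") {
        return event.to_string();
    }
    let key = "\"model\":\"";
    if let Some(pos) = event.find(key) {
        let start = pos + key.len();
        if let Some(rel) = event[start..].find('"') {
            return format!("{}{}{}", &event[..start], request_model, &event[start + rel..]);
        }
    }
    event.to_string()
}

/// Rebuild the outgoing bytes from the drained events. No splice offsets
/// are computed against the original chunk, so a trailing partial event
/// left in `buffer` can never shift the byte range being replaced.
fn drain_and_rewrite(buffer: &mut String, request_model: &str) -> String {
    let mut out = String::new();
    while let Some(event) = take_sse_block(buffer) {
        out.push_str(&rewrite_event(&event, request_model));
    }
    out
}
```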


Comment on lines +266 to +271
if let Ok(mut json_val) = serde_json::from_slice::<Value>(&body_bytes) {
if json_val.get("model").and_then(|m| m.as_str()) == Some(ctx.request_model.as_str()) {
// model was not rewritten; nothing to do
} else if json_val.as_object_mut().is_some() {
json_val["model"] = json!(ctx.request_model);
if let Ok(new_bytes) = serde_json::to_vec(&json_val) {


P2: Parse usage before overwriting response model in non-streaming path

The response body is rewritten to ctx.request_model before usage/model extraction, so TokenUsage::from_*_response and the fallback json_value["model"] now observe the rewritten model rather than the upstream provider model. In log_usage_internal, pricing can still be based on model when pricing_model_source != "request", which means mapped-model accounting becomes inaccurate for non-streaming calls after this change.
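A minimal sketch of the ordering fix this comment suggests: capture the upstream model (and usage) from the raw body before the rewrite runs, so accounting still sees the provider's model. All names here are assumptions, not the repository's actual API:

```rust
/// What accounting needs from the raw upstream body. In real code this
/// would also carry token usage; only the model is sketched here.
struct UpstreamInfo {
    upstream_model: Option<String>,
}

/// Read the `model` value without modifying the body.
fn read_model(body: &str) -> Option<String> {
    let key = "\"model\":\"";
    let start = body.find(key)? + key.len();
    let rel = body[start..].find('"')?;
    Some(body[start..start + rel].to_string())
}

/// Extract first, rewrite second: pricing keyed off the provider model
/// stays accurate even though the client sees the request model.
fn extract_then_rewrite(body: &str, request_model: &str) -> (String, UpstreamInfo) {
    let info = UpstreamInfo { upstream_model: read_model(body) };
    let key = "\"model\":\"";
    let rewritten = match body.find(key) {
        Some(pos) => {
            let start = pos + key.len();
            match body[start..].find('"') {
                Some(rel) => format!("{}{}{}", &body[..start], request_model, &body[start + rel..]),
                None => body.to_string(),
            }
        }
        None => body.to_string(),
    };
    (rewritten, info)
}
```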

