Skip to content

fix: add native Gemini embedding driver#997

Open
chethanuk wants to merge 3 commits intoRightNow-AI:mainfrom
chethanuk:feat/gemini-embedding-2
Open

fix: add native Gemini embedding driver#997
chethanuk wants to merge 3 commits intoRightNow-AI:mainfrom
chethanuk:feat/gemini-embedding-2

Conversation

@chethanuk
Copy link
Copy Markdown

This pull request introduces support for Gemini embedding models by adding API key auto-detection for Gemini and Google, implementing a new GeminiEmbeddingDriver, and configuring default model settings. It also includes minor refactorings for MCP transport configuration and formatting cleanups in the web fetch and search modules. Feedback was provided regarding the use of unwrap_or_default() when reading response bodies, which could mask underlying network or API errors.

Summary

  • add a native Gemini embedding driver for memory embeddings
  • detect when Gemini batch embedding access is unavailable and fall back to single embedContent requests
  • auto-detect Gemini embedding credentials and default model selection
  • update MCP transport config usage for rmcp 1.3 compatibility

Validation

  • cargo fmt --all -- --check
  • cargo test --workspace embedding --quiet
  • cargo clippy --workspace --all-targets -- -D warnings
  • cargo build --release -p openfang-cli
  • live local daemon verification with Gemini embeddings enabled
  • live verification of batch-to-single fallback behavior

New Features

  • Added Google Gemini as a supported embeddings provider with automatic API key detection (GEMINI_API_KEY and GOOGLE_API_KEY).
  • Updated default embedding model for Gemini/Google to a higher-capacity embedding.
  • Bug Fixes / Reliability

    • Improved embedding request behavior with automatic fallback from batch to single requests for greater robustness.
  • Tests

    • Simplified/clarified unit test assertions for readability.
  • Refactor

    • Minor code and configuration cleanups for HTTP client setup and formatting.

Copy link
Copy Markdown
Member

@jaberjaber23 jaberjaber23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Clean Gemini embedding driver with batch/single fallback, Zeroizing for API key security, proper error handling, and tests. Has merge conflicts — please rebase on main.

@jaberjaber23
Copy link
Copy Markdown
Member

This PR has merge conflicts. Please rebase onto the latest main branch and resolve conflicts so we can merge.

@jaberjaber23
Copy link
Copy Markdown
Member

Thanks @chethanuk — after reviewing both your PR and #1052, we'd like to land this one as the canonical Gemini embedding driver because it handles more real-world edge cases:

  • batch → single embedContent fallback with the atomic mode flag (Gemini's batch endpoint isn't available in all regions / plans)
  • GEMINI_API_KEY / GOOGLE_API_KEY both accepted in auto-detect
  • google alias in create_embedding_driver (matches how users configure it)
  • Default model gemini-embedding-2-preview

Two asks before merge:

  1. Rebase on current main — the branch shows CONFLICTING. fix(deps): upgrade wasmtime 41->43 and rumqttc 0.24->0.25 to resolve active CVEs #1041 just landed with wasmtime/rumqttc changes and Cargo.lock churn.
  2. Add an explicit reqwest::Client timeout on the Gemini client (reqwest::Client::new() has no timeout by default — a stuck endpoint will hang the driver). 30s connect + 60s total is a reasonable baseline.

Minor nit: on error bodies, unwrap_or_default() drops useful diagnostic info — consider .unwrap_or_else(|e| format!("<body read error: {e}>")) or similar.

Once rebased and green, we'll merge this. Will close #1052 as the smaller subset.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants