fix: centralize embedding model config to prevent query/ingest mismatch by OmkarKirpan · Pull Request #912 · MemPalace/mempalace

OmkarKirpan · 2026-04-15T12:10:20Z

Summary

Closes #903

Adds centralized embedding model configuration so the MCP server, CLI search, and all ingest paths use the same model — fixing silent query failures when models mismatch.

Design Decisions

Storage approach: Embedding model name stored in ChromaDB collection metadata (not a separate file). Atomic with the collection, can't desync. Absence of the key = legacy palace.
Default for new palaces: all-mpnet-base-v2 (768-dim) — better search quality (+3.5pp on LoCoMo R@10 benchmarks over MiniLM).
Default for existing palaces: all-MiniLM-L6-v2 (384-dim) — backwards compatible, no re-mining required. Detected by absence of embedding_model key in collection metadata.
Resolution chain: Collection metadata (authoritative) > config file / env var (new palaces only) > built-in default. This means once a palace is created, its model is locked in and self-describing.
No migration tool in this PR: Re-embedding existing palaces from MiniLM to mpnet is a separate concern. This PR prevents the mismatch; migrating existing palaces is a follow-up.
All create paths stamp the model: Repair, rebuild, and migrate operations preserve the original model through the delete/recreate cycle.
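
The storage decision above can be pictured with plain dictionaries standing in for ChromaDB collection metadata. This is a minimal sketch, not the PR's code: `stamp_model` and `stored_model` are hypothetical helper names, and the extra `hnsw:space` entry is only illustrative. The `embedding_model` key and the two model names are the only parts taken from the description.

```python
from typing import Optional

METADATA_KEY = "embedding_model"  # key named in the PR description


def stamp_model(metadata: Optional[dict], model_name: str) -> dict:
    """Return a copy of the collection metadata with the model name stamped in."""
    stamped = dict(metadata or {})
    stamped[METADATA_KEY] = model_name
    return stamped


def stored_model(metadata: Optional[dict]) -> Optional[str]:
    """Return the stamped model name, or None when the key is absent (legacy palace)."""
    return (metadata or {}).get(METADATA_KEY)


# Illustrative metadata only; the real collections may carry other keys.
new_palace = stamp_model({"hnsw:space": "cosine"}, "all-mpnet-base-v2")
legacy_palace = {"hnsw:space": "cosine"}  # created before the key existed

print(stored_model(new_palace))
print(stored_model(legacy_palace))
```

Because the name travels inside the metadata dictionary, there is no second file that could drift out of sync with the collection.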

Resolution chain

1. Collection metadata "embedding_model" key (authoritative, stamped at build)
2. If the key is absent on an existing collection → legacy palace → all-MiniLM-L6-v2
3. config.json "embedding_model" or MEMPALACE_EMBEDDING_MODEL env var → new palace creation only; if neither is set, the built-in default (all-mpnet-base-v2) applies
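
The chain above could be expressed roughly as follows. This is a sketch under stated assumptions rather than the contents of `mempalace/embedding.py`: the function names are hypothetical, and the precedence of the env var over config.json is an assumption, since the description does not say which of the two wins.

```python
import os
from typing import Mapping, Optional

LEGACY_MODEL = "all-MiniLM-L6-v2"         # existing palaces without the key
NEW_PALACE_DEFAULT = "all-mpnet-base-v2"  # built-in default for new palaces


def resolve_existing(metadata: Optional[Mapping[str, str]]) -> str:
    """Model for a collection that already exists (steps 1 and 2)."""
    name = (metadata or {}).get("embedding_model")
    return name if name else LEGACY_MODEL


def resolve_for_new(config: Optional[Mapping[str, str]] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Model for a palace that is about to be created (step 3)."""
    env = os.environ if env is None else env
    config = config or {}
    # Assumption: the env var takes precedence over config.json.
    return (env.get("MEMPALACE_EMBEDDING_MODEL")
            or config.get("embedding_model")
            or NEW_PALACE_DEFAULT)


print(resolve_existing({"embedding_model": "all-mpnet-base-v2"}))
print(resolve_existing({}))
print(resolve_for_new(config={}, env={}))
```

Keeping the two entry points separate mirrors the rule that config and env var are consulted only at creation time, never for a palace that already exists.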

Files changed

New: mempalace/embedding.py — model registry, resolution, embedding function factory
Modified: mempalace/config.py — embedding_model property on MempalaceConfig
Modified: mempalace/backends/chroma.py — get_collection(), get_or_create_collection(), create_collection() accept embedding_function + embedding_model_name
Modified: mempalace/palace.py — resolves model from metadata on read, stamps on create
Modified: mempalace/mcp_server.py — _get_collection() uses correct embedding function, tool_status() reports active model
Modified: mempalace/cli.py, mempalace/repair.py, mempalace/migrate.py — all collection-create paths now stamp embedding model
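
For orientation, a model registry of the kind `mempalace/embedding.py` is described as providing might look like the sketch below. `ModelSpec`, `REGISTRY`, `get_spec`, and `check_vector` are hypothetical names chosen for illustration; only the model names and their dimensions (384 and 768) come from the description.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelSpec:
    name: str
    dim: int


# Dimensions as stated in the PR description.
REGISTRY = {
    "all-MiniLM-L6-v2": ModelSpec("all-MiniLM-L6-v2", 384),
    "all-mpnet-base-v2": ModelSpec("all-mpnet-base-v2", 768),
}


def get_spec(name: str) -> ModelSpec:
    """Look up a registered model, failing loudly on unknown names."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown embedding model: {name!r}") from None


def check_vector(name: str, vector: Sequence[float]) -> None:
    """Raise if a vector's length does not match the model's dimension."""
    expected = get_spec(name).dim
    if len(vector) != expected:
        raise ValueError(
            f"{name} expects {expected}-dim vectors, got {len(vector)}"
        )


check_vector("all-MiniLM-L6-v2", [0.0] * 384)  # passes silently
```

A check like this turns the 768 vs 384 mismatch into an explicit error instead of a silent query failure.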

Known follow-ups (not in scope)

Performance: double get_collection call on read paths for metadata resolution (~ms overhead, within budget)
Migration tool to re-embed existing MiniLM palaces with mpnet
OpenAI/external embedding support (Add MemPalace OpenAI 3072-dim embedding support or something similar opensource. #756)
GPU acceleration (feat: GPU-accelerated embeddings via optional sentence-transformers #515)

Test plan

936/936 tests pass locally
Legacy palace (no metadata key) resolves to MiniLM
New palace creation stamps mpnet in collection metadata
Config override and env var override work
Repair/migrate preserve embedding model through rebuild
mempalace_status reports active embedding model
ruff check and ruff format clean

Single source of truth for embedding model resolution. Resolves from collection metadata, falls back to MiniLM for legacy palaces. New palaces default to all-mpnet-base-v2 (768-dim). Part of MemPalace#903

Resolves from config.json or MEMPALACE_EMBEDDING_MODEL env var. Used for new palace creation only; existing palaces read from collection metadata. Part of MemPalace#903

get_collection() and get_or_create_collection() now accept optional embedding_function and embedding_model_name params. Model name is stamped into collection metadata on create. Fully backwards compatible. Part of MemPalace#903

On create: stamps new_palace_model() (mpnet) into collection metadata. On read: resolves model from metadata, falls back to MiniLM for legacy. All collection access now uses the correct embedding function. Also fixes tests that opened bare PersistentClient instances without the correct embedding function, causing dimension mismatches (768 vs 384). Part of MemPalace#903

_get_collection() resolves the model from collection metadata and passes the correct embedding_function to ChromaDB. tool_status() reports the active embedding_model. Closes MemPalace#903

- ChromaBackend.create_collection() now accepts embedding_function and embedding_model_name params
- cli.py repair, repair.py rebuild_index: read embedding model from existing collection before delete/recreate, preserve it
- migrate.py: stamp new_palace_model() on migrated palaces
- palace.get_collection(): accept optional config param so CLI mining respects config.json embedding_model setting
- Update test_rebuild_index_success to verify new embedding args

Addresses code review findings MemPalace#4, MemPalace#5, MemPalace#7 for MemPalace#903
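
The "read before delete, then recreate" pattern for repair and rebuild can be illustrated with an in-memory stand-in. `FakeBackend` and `rebuild` are hypothetical and deliberately do not mirror the real `ChromaBackend` signatures; the sketch only shows the ordering that keeps the model name from being lost.

```python
class FakeBackend:
    """In-memory stand-in for a vector store; not the ChromaBackend API."""

    def __init__(self):
        self._metadata = {}

    def create_collection(self, name, embedding_model_name):
        self._metadata[name] = {"embedding_model": embedding_model_name}

    def create_legacy_collection(self, name):
        # Simulates a palace created before the metadata key existed.
        self._metadata[name] = {}

    def get_metadata(self, name):
        return dict(self._metadata[name])

    def delete_collection(self, name):
        del self._metadata[name]


def rebuild(backend, name, legacy_model="all-MiniLM-L6-v2"):
    """Delete and recreate a collection while keeping its embedding model."""
    # Read the model first: once the collection is deleted, the stamp is gone.
    model = backend.get_metadata(name).get("embedding_model", legacy_model)
    backend.delete_collection(name)
    backend.create_collection(name, embedding_model_name=model)
    return model


backend = FakeBackend()
backend.create_collection("demo", "all-mpnet-base-v2")
print(rebuild(backend, "demo"))
```

Under this sketch, a legacy collection that is rebuilt ends up explicitly stamped with MiniLM, which matches the model it was already using.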

Three Phase 2 fixes:

1. Embedding model guard: palace_meta.json (MemPalace#903/MemPalace#912):
   - Added embedding_model property to MempalaceConfig (default: all-MiniLM-L6-v2; env MEMPALACE_EMBEDDING_MODEL overrides).
   - write_palace_meta() in palace.py writes model name + timestamp to <palace>/palace_meta.json at the end of every mine run.
   - read_palace_meta() in palace.py reads it back at search time.
   - search_memories() in searcher.py compares the stored ingest model against the current config model; if they differ, a "warning" key is added to the search result and logged to stderr.
   - Non-fatal: old palaces without palace_meta.json get no warning.
   - Prevents silent garbage results when users switch embedding models.
2. silent_save config respected in hooks_cli.py (MemPalace#854):
   - hook_stop() now gates on MEMPAL_VERBOSE env var, matching the existing behavior in hooks/mempal_save_hook.sh.
   - Default (MEMPAL_VERBOSE unset): mine transcript in background, never block the AI or interrupt the conversation.
   - MEMPAL_VERBOSE=true: block with diary reason (developer mode).
   - Updated tests to reflect new default behavior.
3. sanitize_name() Unicode: confirmed already working; skipped (MemPalace#637). Python 3's re.UNICODE default makes [^\W_] match all Unicode letters.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
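
The palace_meta.json guard described in item 1 of this commit message might be sketched as follows. The function names `write_palace_meta` and `read_palace_meta` appear in the message, but their signatures, the JSON field names, and the `model_warning` helper are assumptions made for illustration.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

META_FILE = "palace_meta.json"


def write_palace_meta(palace_dir: Path, model: str) -> None:
    """Record the ingest model and a timestamp (field names are assumed)."""
    payload = {
        "embedding_model": model,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    (palace_dir / META_FILE).write_text(json.dumps(payload), encoding="utf-8")


def read_palace_meta(palace_dir: Path) -> Optional[dict]:
    """Return the stored metadata, or None for palaces without the file."""
    path = palace_dir / META_FILE
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def model_warning(palace_dir: Path, current_model: str) -> Optional[str]:
    """Return a warning string on mismatch; None when matching or unknown."""
    meta = read_palace_meta(palace_dir)
    if meta is None:
        return None  # non-fatal: old palaces get no warning
    stored = meta.get("embedding_model")
    if stored and stored != current_model:
        return f"palace was mined with {stored}, but searching with {current_model}"
    return None


with tempfile.TemporaryDirectory() as tmp:
    palace = Path(tmp)
    legacy_result = model_warning(palace, "all-mpnet-base-v2")
    write_palace_meta(palace, "all-MiniLM-L6-v2")
    mismatch = model_warning(palace, "all-mpnet-base-v2")
    match = model_warning(palace, "all-MiniLM-L6-v2")

print(mismatch)
```

Note that this guard only warns, whereas the collection-metadata approach in this PR selects the matching embedding function outright.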

NickShtefan · 2026-04-26T10:46:56Z

@OmkarKirpan — heads-up that #442 just got rebased onto develop (mergeStateStatus moved from DIRTY → UNSTABLE; CI running). My fixes overlap on the _get_collection mismatch bug (#903) but the scopes differ:

| | #912 | #442 |
|---|---|---|
| MCP `_get_collection` reads model from collection metadata | ✓ | ✓ |
| `EmbeddingModelMismatchError` | - | ✓ (raised + propagated through `handle_request`) |
| `mempalace init --model <name>` | - | ✓ (model stamped at palace creation) |
| `mempalace re-mine --model <new>` | - | ✓ (safe migration) |
| Auto-detect device (cuda > mps > cpu) | - | ✓ |
| `[multilingual]` extra | - | ✓ |
| Default for new palaces | mpnet-base-v2 (768) | configurable via `--model` |
| Backwards compat for legacy palaces | ✓ (MiniLM fallback) | ✓ (auto-stamped as `chromadb-default`) |

#442 effectively contains your fix as a subset. If maintainers prefer the broader scope, #912 would be superseded. If they prefer the narrower minimal fix first, I can wait for #912 to merge and rebase #442 on top — happy to coordinate either way.

@igorls @bensig — would appreciate guidance on which sequencing you want.

OmkarKirpan added 6 commits April 15, 2026 16:07

feat: add embedding model registry (mempalace/embedding.py)

93cac5e

feat: add embedding_model property to MempalaceConfig

85f253d

feat: accept embedding_function in ChromaBackend

6669409

feat: MCP server uses correct embedding model per palace

95cc2fc

OmkarKirpan marked this pull request as ready for review April 15, 2026 12:15

OmkarKirpan requested review from bensig, igorls and milla-jovovich as code owners April 15, 2026 12:15

igorls added labels bug (Something isn't working), area/search (Search and retrieval), area/mcp (MCP server and tools) Apr 15, 2026

matrix9neonebuchadnezzar2199-sketch mentioned this pull request Apr 15, 2026

RFC: Synapse Phase 10–14 — Model Guard, Cross-Wing Balancing, Score Explainability, Adaptive Compaction, Paginated Scoring #914

Open

This was referenced Apr 26, 2026

feat: add configurable multilingual embedding model support #442

Open

bug: embedding model mismatch — MCP server uses MiniLM (384-dim) while ingest can use mpnet (768-dim) #903

Open

OmkarKirpan wants to merge 6 commits into MemPalace:develop from OmkarKirpan:fix/embedding-model-mismatch
