repair --mode max-seq-id crashes with ValueError on BLOB-typed embeddings.seq_id rows #1254

@NickShtefan

What happened?

mempalace repair --mode max-seq-id (added in #1135 to recover palaces poisoned by the old _fix_blob_seq_ids shim) crashes during the dry-run inspection on palaces where embeddings.seq_id rows are still in chromadb 1.5.x's native BLOB format (8-byte big-endian uint64).

_compute_heuristic_seq_id (mempalace/repair.py:552) computes the recovery value via:

SELECT MAX(e.seq_id)
FROM embeddings e
JOIN segments s ON e.segment_id = s.id
WHERE s.collection = (SELECT collection FROM segments WHERE id = ?)

then calls int(row[0]) at line 576. When MAX(e.seq_id) returns a BLOB (which it does on palaces where chromadb 1.5.x has been writing seq_ids natively), int() tries to parse the raw bytes as a base-10 literal and raises ValueError.
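The failure can be reproduced without a palace. The sketch below is a minimal illustration, assuming an in-memory SQLite table with a single untyped `seq_id` column as a stand-in (it is not the real chromadb schema). SQLite orders BLOB values after all numeric values, so MAX returns the BLOB as soon as one is present, and int() then rejects the raw bytes.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Hypothetical stand-in for embeddings; the column has no declared type,
# so each value keeps the storage class it was inserted with.
con.execute("CREATE TABLE embeddings (seq_id)")
con.execute("INSERT INTO embeddings VALUES (?)", (11684,))                        # INTEGER row
con.execute("INSERT INTO embeddings VALUES (?)", ((11694).to_bytes(8, "big"),))  # BLOB row

max_val = con.execute("SELECT MAX(seq_id) FROM embeddings").fetchone()[0]
print(type(max_val).__name__, max_val)

try:
    int(max_val)          # same call as repair.py line 576
    crashed = False
except ValueError as exc:
    crashed = True
    print("ValueError:", exc)
```

A side effect of that ordering is worth keeping in mind: in a column that mixes both types, MAX prefers any BLOB over every INTEGER regardless of numeric value, so an INTEGER row larger than all BLOB rows would not be returned.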

What did you expect?

The recovery tool should successfully detect and report the heuristic value for poisoned segments without crashing on BLOB-typed embeddings.seq_id. After dry-run, the user should see what would be repaired and have the option to apply.

Specifically:

  • _compute_heuristic_seq_id should accept both INTEGER and BLOB return types from MAX(e.seq_id)
  • BLOB should be decoded as 8-byte big-endian uint64 (matching chromadb's storage format)
  • Dry-run should print detected poisoned rows and the proposed clean values, exit 0

How to reproduce:

  1. Have a palace where embeddings.seq_id has at least some BLOB-typed rows. In my case 11684 rows had already been migrated to INTEGER by mempalace's narrowed _fix_blob_seq_ids shim, but at least one raw BLOB row remained (chromadb 1.5.x writes new seq_ids in native BLOB format).

  2. Run dry-run repair:

    $ mempalace repair --mode max-seq-id --dry-run
    
    Traceback (most recent call last):
      File ".../mempalace/__main__.py", line 5, in <module>
        main()
      File ".../mempalace/cli.py", line 1484, in main
        dispatch[args.command](args)
      File ".../mempalace/cli.py", line 813, in cmd_repair
        repair_max_seq_id(...)
      File ".../mempalace/repair.py", line 677, in repair_max_seq_id
        new_val = _compute_heuristic_seq_id(cur, seg_id)
      File ".../mempalace/repair.py", line 576, in _compute_heuristic_seq_id
        return int(row[0])
    ValueError: invalid literal for int() with base 10: b'\x00\x00\x00\x00\x00\x00-\xae'
    
      =====================================================
        MemPalace Repair — max_seq_id Un-poison
      =====================================================
        Palace: /Users/nick/.mempalace/palace
    
  3. The bytes b'\x00\x00\x00\x00\x00\x00\x2D\xAE' are 0x2DAE = 11694 — the correct heuristic value, but stored as BLOB rather than INTEGER. The function computed it correctly but couldn't parse the storage format.
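The decoding in step 3 can be checked directly. This is a small worked example using only the standard library; the byte string is the one from the traceback above (`-` is the ASCII character for 0x2D).

```python
import struct

raw = b"\x00\x00\x00\x00\x00\x00\x2d\xae"

# Two equivalent ways to read an 8-byte big-endian unsigned integer.
via_int = int.from_bytes(raw, "big")
(via_struct,) = struct.unpack(">Q", raw)

# Manual check of the two non-zero bytes: 0x2D * 256 + 0xAE = 45 * 256 + 174.
manual = 0x2D * 256 + 0xAE

print(via_int, via_struct, manual)  # 11694 11694 11694
```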

Environment:


Impact

The recovery tool added in #1135 is the only documented path to un-poison palaces hit by the original PR #664 shim bug. When the heuristic crashes, the user has no official recovery: repair --mode legacy (the alternative) is itself buggy (see separate issue about embedding_model preservation), and manual SQL UPDATE on max_seq_id is undocumented.

This affects every user who:

  1. Was on a mempalace version with the old PR #664 shim ("fix: auto-repair BLOB seq_ids from chromadb 0.6→1.5 migration") before #1135 ("fix: narrow _fix_blob_seq_ids + add repair --mode max-seq-id") merged (2026-04-27)
  2. Now wants to upgrade and recover writes

That's the population that #1135 was specifically designed to help.

Reference comparison

_read_sidecar_seq_ids (repair.py:579-598) does explicitly guard against BLOB:

for segment_id, seq_id, kind in rows:
    if kind == "blob":
        raise ValueError(
            f"Sidecar has BLOB-typed seq_id for {segment_id}; refusing to use it. "
            "Pass a sidecar that was already migrated to INTEGER rows."
        )

So the author knew BLOB was a possibility; the gap is only in _compute_heuristic_seq_id.
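To see whether a given palace is affected, the storage classes in embeddings.seq_id can be counted with SQLite's typeof(). The sketch below is an assumption about how such a check could look; `count_seq_id_types` is a hypothetical helper, not part of mempalace, and whether the `kind` value in the sidecar guard is derived the same way is a guess. The demo runs against an in-memory stand-in table rather than a real palace database.

```python
import sqlite3

def count_seq_id_types(con: sqlite3.Connection) -> dict:
    """Map each SQLite storage class found in embeddings.seq_id to its row count."""
    rows = con.execute(
        "SELECT typeof(seq_id), COUNT(*) FROM embeddings GROUP BY typeof(seq_id)"
    ).fetchall()
    return dict(rows)

# Stand-in table for the demo (hypothetical, not the real chromadb schema).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE embeddings (seq_id)")
con.executemany(
    "INSERT INTO embeddings VALUES (?)",
    [(1,), (2,), ((3).to_bytes(8, "big"),)],
)

counts = count_seq_id_types(con)
print(counts)
```

Any non-zero count under "blob" means MAX(e.seq_id) will come back as bytes and trigger the crash described above.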

Suggested fix

Three-line change in mempalace/repair.py:

def _compute_heuristic_seq_id(cur: sqlite3.Cursor, segment_id: str) -> int:
    row = cur.execute(...).fetchone()
    if row is None or row[0] is None:
        return 0
    val = row[0]
    if isinstance(val, (bytes, bytearray)):
        val = int.from_bytes(val, "big")
    return int(val)
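For anyone who wants to try the change outside the repo, here is a self-contained sketch. It assumes a two-table stand-in schema with only the columns the query touches (the real chromadb tables have more), and the function is renamed `compute_heuristic_seq_id` to make clear it is a copy for illustration, not the code in repair.py.

```python
import sqlite3

def compute_heuristic_seq_id(cur: sqlite3.Cursor, segment_id: str) -> int:
    """Return the max seq_id for the segment's collection, accepting INTEGER or BLOB."""
    row = cur.execute(
        """
        SELECT MAX(e.seq_id)
        FROM embeddings e
        JOIN segments s ON e.segment_id = s.id
        WHERE s.collection = (SELECT collection FROM segments WHERE id = ?)
        """,
        (segment_id,),
    ).fetchone()
    if row is None or row[0] is None:
        return 0
    val = row[0]
    if isinstance(val, (bytes, bytearray)):
        # chromadb 1.5.x native format: 8-byte big-endian uint64
        val = int.from_bytes(val, "big")
    return int(val)

# Minimal stand-in schema (assumption: only the columns used by the query).
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE segments (id TEXT PRIMARY KEY, collection TEXT);
    CREATE TABLE embeddings (segment_id TEXT, seq_id);
    """
)
con.execute("INSERT INTO segments VALUES ('seg-a', 'col-1')")
con.execute("INSERT INTO segments VALUES ('seg-b', 'col-2')")
con.execute("INSERT INTO embeddings VALUES ('seg-a', ?)", (11684,))
con.execute("INSERT INTO embeddings VALUES ('seg-a', ?)", ((11694).to_bytes(8, "big"),))

cur = con.cursor()
with_blob = compute_heuristic_seq_id(cur, "seg-a")   # collection contains a BLOB row
empty = compute_heuristic_seq_id(cur, "seg-b")       # collection has no embeddings
print(with_blob, empty)  # 11694 0
```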

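Or stricter: query MAX(CAST(e.seq_id AS INTEGER)) to push the conversion into SQLite (matches what migration 00005-max-seq-id-int.sqlite.sql does). One caveat: SQLite's CAST converts a BLOB to TEXT before parsing it as a number, so it does not decode big-endian bytes, and this variant would need verifying against real BLOB rows first. The Python-side fix is simpler and decodes the value explicitly.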
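The CAST behaviour on a BLOB is easy to check in isolation before choosing the SQL-side variant. The sketch below is a minimal check under the assumption that the BLOB holds the 8-byte big-endian encoding of 11694, as in the traceback; it compares SQLite's CAST with the Python-side decode.

```python
import sqlite3

con = sqlite3.connect(":memory:")
blob = (11694).to_bytes(8, "big")  # b'\x00\x00\x00\x00\x00\x00-\xae'

# SQLite converts the BLOB to TEXT first, then extracts a leading integer prefix.
cast_result = con.execute("SELECT CAST(? AS INTEGER)", (blob,)).fetchone()[0]

# Python decodes the bytes as a big-endian unsigned integer.
decoded = int.from_bytes(blob, "big")

print(cast_result, decoded)
```

Because the byte string starts with NUL bytes rather than ASCII digits, the cast finds no integer prefix, so the two values differ. That suggests the SQL-side variant is not a drop-in equivalent for rows stored in this format.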

Related
