Skip to content

fix(kg): reject inverted intervals in add_triple (valid_to < valid_from)#1214

Open
arnoldwender wants to merge 1 commit intoMemPalace:developfrom
arnoldwender:fix/kg-temporal-inversion-guard
Open

fix(kg): reject inverted intervals in add_triple (valid_to < valid_from)#1214
arnoldwender wants to merge 1 commit intoMemPalace:developfrom
arnoldwender:fix/kg-temporal-inversion-guard

Conversation

@arnoldwender
Copy link
Copy Markdown
Contributor

What and Why

A triple with valid_to before valid_from satisfies neither of the temporal filter clauses in KnowledgeGraph.query_entity():

valid_from <= as_of AND valid_to >= as_of

so the triple is invisible to every query — silently corrupt. The data lives in SQLite forever but never surfaces, and the caller never sees a warning. This is a P0 data-integrity bug in a path adapters can hit easily (any caller that mixes up the two date params).

Root Cause

mempalace/knowledge_graph.py:149add_triple() accepts valid_from and valid_to as opaque strings, validates each format independently in upstream callers (and now PR #1167 at the MCP boundary), but never checks the relationship between them.

Fix

Reject at write time with a clear ValueError naming both bounds. The guard fires only when both are set:

  • Open intervals (only valid_from, or only valid_to) — accepted unchanged.
  • Same-day intervals (valid_from == valid_to, point-in-time facts) — explicitly allowed (< not <=).
  • String comparison is correct because sanitize_iso_date (PR fix(kg): validate ISO-8601 date formats at MCP boundary #1167) ensures YYYY-MM-DD lex order matches calendar order.

Test plan

  • test_add_triple_rejects_inverted_interval — asserts ValueError with 'before valid_from' match
  • test_add_triple_accepts_equal_dates — point-in-time facts pass
  • test_add_triple_allows_only_one_bound — open intervals (one bound) still pass
  • All 24 test_knowledge_graph.py tests green
  • Full suite: 1318 passed locally (2 pre-existing failures in test_backends.py::test_pin_hnsw_threads* unrelated — environmental ChromaDB configuration_json["hnsw"] schema drift, also fails on a clean develop checkout)

A triple with valid_to < valid_from satisfies neither of the temporal
filter clauses in query_entity():

    valid_from <= as_of AND valid_to >= as_of

so the triple is invisible to every query — silently corrupt. Reject
at write time with a clear error instead of letting bad data pile up
in the SQLite store.

The guard only fires when both bounds are present; open intervals
(only valid_from or only valid_to) are still accepted, and same-day
intervals (valid_from == valid_to, point-in-time facts) are explicitly
allowed.
@jphein
Copy link
Copy Markdown
Collaborator

jphein commented Apr 27, 2026

+1 to rejecting at write time — silently invisible triples are a hard failure mode to debug after the fact.

One coordination note worth surfacing: lex-string valid_to < valid_from is correct only once both sides are canonical YYYY-MM-DD. Counterexample without sanitize_iso_date on the boundary:

"2026-12-01" < "2026-3-01"  # True — the guard would falsely reject Dec→Mar as inverted

The PR description names the dependency on #1167. Just noting that if #1167 stalls and #1214 lands first, MCP tool_kg_add callers passing non-zero-padded months would hit spurious ValueError. Tests in this PR all use canonical form so they don't exercise the gap.

Low-risk in practice (LLM-driven callers tend to emit ISO 8601), and the right fix shape is #1167 anyway. Not blocking — just worth tagging the dependency in case the merge order goes #1214#1167 rather than the other way.

@arnoldwender
Copy link
Copy Markdown
Contributor Author

@jphein Thanks — good call on tagging the merge-order gap explicitly. Concrete behavior if #1214 lands before #1167:

  • MCP tool_kg_add callers passing non-zero-padded months like "2026-3-01" would hit ValueError: valid_to < valid_from even on valid intervals, because lex-string "2026-12-01" < "2026-3-01" is True.
  • LLM-driven callers tend to emit canonical ISO 8601, so practical exposure is low.
  • The right fix is fix(kg): validate ISO-8601 date formats at MCP boundary #1167 (sanitize_iso_date at the MCP boundary), not duplicate validation here.

Happy with either merge order. If #1167 first, the gap closes naturally; if #1214 first, the limitation is documented and #1167 follows up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants