Benchmark: Verify LoCoMo category assignments match source code, not paper #4

@bm-clawd

Context

MemMachine's benchmark blog post reports that the LoCoMo category assignments described in the paper differ from those in the source code:

'This finding suggests that some public LoCoMo results might be presenting misclassified data, making a direct and fair comparison challenging.'

MemMachine treats the source code assignments, not the paper's descriptions, as ground truth.

Action

  1. Compare our category assignments against the LoCoMo source code (github.com/snap-research/LoCoMo)
  2. Document any discrepancies with the paper
  3. Ensure our per-category results use the correct assignments
  4. If our categories were wrong, re-run and report corrected numbers
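
Steps 1 and 2 could be sketched as a plain diff between two mappings from question ID to category. This is a minimal sketch under stated assumptions: the function names, the question ID scheme, and the toy data are hypothetical, and loading the real assignments from our results or from the LoCoMo repository is left out.

```python
from collections import Counter


def find_category_mismatches(ours, reference):
    """Return {question_id: (our_category, reference_category)} for each disagreement.

    A question missing from one side is reported with None on that side.
    """
    mismatches = {}
    for qid in sorted(set(ours) | set(reference)):
        mine, theirs = ours.get(qid), reference.get(qid)
        if mine != theirs:
            mismatches[qid] = (mine, theirs)
    return mismatches


def summarize(mismatches):
    """Count mismatches by (our_category, reference_category) pair."""
    return Counter(mismatches.values())


# Hypothetical toy data, NOT real LoCoMo assignments.
ours = {"q1": 1, "q2": 2, "q3": 4}
reference = {"q1": 1, "q2": 3, "q3": 4, "q4": 5}

mismatches = find_category_mismatches(ours, reference)
print(mismatches)
print(summarize(mismatches))
```

An empty result would mean our assignments already agree with the reference. Any non-empty result gives the list to document for step 2 and tells us whether the re-run in step 4 is needed.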

This matters for credibility: if we publish numbers with wrong categories, competitors will call it out.

Milestone

v0.19.0
