docs(requirements): add contextual user model requirements with scoring model rationale#786
docs(requirements): add contextual user model requirements with scoring model rationale#786tianjianjiang wants to merge 2 commits intomasterfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces an exceptionally detailed and well-structured requirements document for the Contextual User Model. The document thoroughly covers the problem statement, design, and a phased implementation plan, which will be a great asset for the project. My review includes one high-severity suggestion to address an inconsistency in the reporting of phase statuses to ensure clarity for all stakeholders.
There was a problem hiding this comment.
Critical Issues Found
I've identified 4 critical issues in this requirements document:
-
PR dependency ordering (line 94): The PR stack table doesn't reflect that this PR depends on #784 being merged first, creating merge strategy ambiguity.
-
Algorithm logic ambiguity (line 561): Contradictory description of fixedSpan blocking behavior - unclear whether it blocks all candidates at the fixed position or only positions inside the span.
-
KN formula correctness (line 786): The continuation probability normalization is incorrect - should normalize by sum of N+(w') for words in that reading context, not total unique bigrams N++. This affects all probability calculations.
-
User data loss risk (line 1012): Phase 4 is marked DONE and saves contextual model data, but Phase 6 migration (backing up UserPhrasesLM) is TODO. Users could lose existing phrase data if Phase 4 ships before migration is implemented.
The first three are documentation correctness issues. The fourth is a potential user data loss scenario that should be addressed before any phase ships to users.
There was a problem hiding this comment.
Pull request overview
Adds a comprehensive requirements.md document that specifies and tracks the phased design/implementation plan for introducing a contextual, persisted user preference model (KN-smoothed) to replace legacy score-hack mechanisms in McBopomofo.
Changes:
- Introduces a full multi-phase requirements/specification document (phases 0A–6) with goals, motivations, acceptance criteria, and test summaries.
- Documents design principles and architecture (session structural fixes, KN backoff scoring, persistence).
- Captures known limitations, deferred work, and a PR stack plan tying phases to related PRs/commits.
6106a7a to
e246baa
Compare
972339b to
218ed48
Compare
There was a problem hiding this comment.
Reviewed requirements.md for critical issues. Found 5 consistency/correctness problems that need resolution:
- PR #777 vs #779 inconsistency - Text references PR #777 but table shows #779
- Commit hash mismatch - Different commit hashes referenced for same work (9fa53a6 vs 9f9477d)
- Circular documentation reference - References "original Phase 1 requirements" within the requirements doc itself
- Phase dependency graph contradiction - Linear chain notation doesn't match stated parallel dependencies
- Missing PR stack table entry - This PR (#786) should be listed in the table
These are documentation correctness issues that will cause confusion for readers trying to understand the implementation history and phase relationships.
|
Claude Code Review Failed The automated Claude review encountered an error and could not complete. You can:
This does not affect the PR approval process. |
There was a problem hiding this comment.
Critical Issue: Commit References Don't Exist
The document claims phases 0A-4 are "DONE" with specific commits, but those commits don't exist in the repository. All referenced PRs (#779, #780, #781) are still open with different commit SHAs than documented.
This creates correctness issues for anyone relying on this requirements doc to understand implementation status or verify completed work.
e246baa to
c9154c8
Compare
There was a problem hiding this comment.
Critical Issues
This requirements document has several issues that could cause confusion about implementation status and architectural assumptions:
- Status inconsistencies: Features marked "DONE" reference PRs listed as "Open", creating ambiguity about what's actually implemented
- Test status misleading: All tests marked "(Pass)" for features that may not be merged yet
- Base branch assumptions: Document references commits and PRs that may not exist in the base branch ancestry
- Documentation standard violations: Timestamp format doesn't follow CLAUDE.md requirements
These issues don't affect code correctness but could mislead contributors and AI agents about the current state of the codebase.
c9154c8 to
8df0535
Compare
There was a problem hiding this comment.
Critical Issues Summary
Found 6 critical issues that need resolution:
1. Phase Status Inconsistency
FR-1 through FR-5 marked "TODO" in phase summary but claim implementation with "(Pass)" test results. Unclear what's actually done vs planned.
2. Non-existent PR References
PR stack references #779-785 as if they exist, but they're future PRs. Document credibility issue.
3. False File Path References
Multiple FRs reference walk_strategy.h/cpp and contextual_user_model.h/cpp that don't exist yet, using past-tense language implying they're implemented.
4. Missing Time Unit in Decay Formula
halfLife=20 lacks unit specification (selections? days? sessions?). Formula cannot be implemented correctly without this.
5. YAGNI Violation
FR-2 claims YAGNI but includes detailed specs for Pruned Viterbi, MMSEG, Segment Viterbi with specific parameters. Either implement them or remove the specs.
6. Memory Safety Issue Misclassified as "Low"
L4 describes stack-allocated LM wrapped in aliasing shared_ptr with no-op deleter - this is a use-after-free vulnerability, not a low-priority limitation. Should be blocker priority.
218ed48 to
988b086
Compare
8df0535 to
0ab7122
Compare
There was a problem hiding this comment.
Reviewed requirements.md for critical issues. Found 6 issues requiring attention before merge:
- PR reference inconsistency: PR #777 vs #779 for Viterbi walk implementation
- Ambiguous phase status: TODO markers conflict with implemented features
- Formula notation: Temporal decay formula needs clarification
- High-severity correctness issue: Dual model systems (UOM + ContextualUserModel) running simultaneously
- Test count verification: Ensure 54 tests matches actual suite
- Phase dependency error: Linear sequence doesn't match described parallel dependencies
The dual-model issue (L7) is particularly critical as it creates non-deterministic behavior and should be resolved before FR-6 is considered complete.
988b086 to
34143ba
Compare
0ab7122 to
edb593e
Compare
There was a problem hiding this comment.
Found 6 critical issues requiring fixes:
- FR-0 requirement ambiguity: "at minimum 7 techniques" vs "Seven techniques" inconsistency
- FR-6 test completeness: Verify 35 XCTests is the complete suite or add qualifier
- Phase 4 dependency error: FR-4 incorrectly depends on Phase 2 (should be standalone)
- PR stack mismatch: Description mentions #787, stack shows #783; numbering inconsistent
- Temporal decay unit missing: halfLife=20 in what units? (seconds/days/observations)
- Test numbering confusion: "10 tests (1.5.1-1.5.12)" is actually 12 items, not 10
All issues impact requirements correctness, implementation guidance, or internal consistency.
34143ba to
bd56fc6
Compare
edb593e to
d39c375
Compare
|
Claude Code Review Failed The automated Claude review encountered an error and could not complete. You can:
This does not affect the PR approval process. |
434ab66 to
24f7046
Compare
d39c375 to
05b59d5
Compare
|
Claude Code Review Failed The automated Claude review encountered an error and could not complete. You can:
This does not affect the PR approval process. |
…ng model rationale Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
05b59d5 to
bbf280f
Compare
|
Claude Code Review Failed The automated Claude review encountered an error and could not complete. You can:
This does not affect the PR approval process. |
|
Claude Code Review Failed The automated Claude review encountered an error and could not complete. You can:
This does not affect the PR approval process. |
Summary
requirements.mdwith 10 functional requirements (FR-0 through FR-9) using Job Stories and EARS acceptance criteriaalgorithm.mdto document both the Viterbi DP walk and the scoring model rationale: legacy score hack problem, seven-technique smoothing comparison, the "丼 problem" design breakthrough, and four-level backoff formulas<required>criteria use EARS syntax with no C++ implementation details#786 (independent, base: master)
Test plan
<required>)Generated with Claude Code