[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence#307
Open
[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence#307
Conversation
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Tighten semantic review-complete matching and keep deliverable guardrail docs assertions aligned with current specification wording. Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
lossyrob
commented
May 6, 2026
Owner
Author
lossyrob
left a comment
There was a problem hiding this comment.
PAW Review Summary
The bulk of feedback below concentrates on two SQL-backed mechanisms this PR introduces that, on close reading, weaken rather than strengthen the gates they are meant to harden:
- The boundary UPSERT silently preserves stale
status='done'on resume/re-entry (MUST). - The new prefix-filtered work-completion gate flips a previously fail-closed check to fail-open on prefix mismatch and on empty result sets (MUST).
The remainder are vocabulary/specification gaps in paw-transition, a coverage concern that the new test files assert prose rather than behavior (which is what allows the two MUST findings to slip through), and a few maintainability follow-ups.
Findings: 2 Must-address, 7 Should-address, 3 Could-consider (12 inline comments; 1 additional Could-consider was skipped per critique as speculative).
Open questions for the author
- For the work-completion gate: do you want the empty-set work-TODO state to fail closed (recommended) or fail open with an explicit attestation only when an out-of-band check has been made?
- Is "Per Review Strategy" at
skills/paw-transition/SKILL.md:53a typo for "Per Review Policy", or does Review Strategy genuinely gate pause decisions in some path I missed? - For the prose-vs-behavior tests: would you prefer the behavioral conversion to land in this PR, or as a follow-up tracked against
tests/integration/? The boundary-UPSERT and work-gate regression tests should land here regardless.
🐾 Review generated with PAW Review
Reset boundary TODO status on upsert, fail closed when no work TODO or no-work attestation exists, and document/cover safe work-id validation before SQL-like TODO filtering. Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Add missing planning-docs-review transition rows, use Review Policy terminology consistently, enumerate final-review configuration fields, and make obligation_summary the source for TODO suffixes with explicit none/default cases. Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Move the runtime-marker denylist and config-only section checks into a shared test helper, then reuse it across guardrail, paw-init, and lifetime tests. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Document why standard PAW uses transition obligation_summary while PAW-Lite uses boundary briefs and SQL TODOs for the same configured-obligation contract. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Keep the new PAW-Lite work-gate regression fixture type-safe under the integration test TypeScript configuration. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Update PAW-Lite content expectations for the shared config-only wording and boundary upsert status reset contract. Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Accept equivalent runtime wording that blocks implementation until planning docs review executes, avoiding a brittle phrase-only assertion. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Broaden SDK-driven assertions to accept equivalent wording for planning-docs-review gating and final-review findings resolution. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
Accept SDK boundary wording that identifies direct implementation as the incorrect shortcut when planning docs review is enabled. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes #306.
Hardens PAW-Lite and standard PAW transition behavior so configured obligations remain visible at stage boundaries without turning generated
WorkflowContext.mdartifacts into mutable runtime state.Changes
paw-init, including resolved-configuration summaries that agents can read without appending control state to generated artifacts.paw-transitionoutput with additive obligation summaries for review policy, planning-docs review, final review, and final PR routing.Final review
Multi-model final review ran with
claude-opus-4.7-xhighandgpt-5.5using smart resolution. Three coverage/documentation findings were auto-applied. The remaining artifact-lifecycle finding was resolved by the finalcommit-and-cleanstop-tracking commit, so.paw/work/hardening-paw-lite-transitions/is not present in the PR-visible diff.Validation
npm run compilenpm run lintbash scripts/lint-prompting.sh --allpython -m mkdocs build --strictcd tests/integration && npm run test:integration:skillsArtifacts
Workflow artifacts are available at the last artifact commit:
af0923f.Breaking changes
None.
Deployment
No deployment or migration steps required.
🐾 Generated with PAW