Skip to content

[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence#307

Open
lossyrob wants to merge 23 commits intomainfrom
feature/hardening-paw-lite-transitions
Open

[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence#307
lossyrob wants to merge 23 commits intomainfrom
feature/hardening-paw-lite-transitions

Conversation

@lossyrob
Copy link
Copy Markdown
Owner

@lossyrob lossyrob commented May 5, 2026

Summary

Fixes #306.

Hardens PAW-Lite and standard PAW transition behavior so configured obligations remain visible at stage boundaries without turning generated WorkflowContext.md artifacts into mutable runtime state.

Changes

  • Adds config-only WorkflowContext guardrails to paw-init, including resolved-configuration summaries that agents can read without appending control state to generated artifacts.
  • Adds named PAW-Lite boundary checkpoints for planning-docs review, implementation, final review, and final PR handoff, with explicit configured obligations and incorrect-shortcut warnings.
  • Persists PAW-Lite boundary/work TODO categories in SQL so stage obligations survive summarization and final-PR admission can be enforced.
  • Tightens standard paw-transition output with additive obligation summaries for review policy, planning-docs review, final review, and final PR routing.
  • Expands integration coverage for PAW-Lite boundary chains, TODO filtering, final PR handoff, generated WorkflowContext lifetime, and standard transition lifetime behavior.
  • Updates user guide, reference, and specification docs to describe the hardened transition and artifact-lifecycle behavior.

Final review

Multi-model final review ran with claude-opus-4.7-xhigh and gpt-5.5 using smart resolution. Three coverage/documentation findings were auto-applied. The remaining artifact-lifecycle finding was resolved by the final commit-and-clean stop-tracking commit, so .paw/work/hardening-paw-lite-transitions/ is not present in the PR-visible diff.

Validation

  • npm run compile
  • npm run lint
  • bash scripts/lint-prompting.sh --all
  • python -m mkdocs build --strict
  • cd tests/integration && npm run test:integration:skills
  • Targeted workflow regression group passed 34/34, covering PAW-Lite boundary chains, planning-docs routing, TODO filtering, preset obligations, review policy, final PR handoff, transition obligations, paw-init WorkflowContext generation, and WorkflowContext lifetime guardrails.

Artifacts

Workflow artifacts are available at the last artifact commit: af0923f.

Breaking changes

None.

Deployment

No deployment or migration steps required.

🐾 Generated with PAW

lossyrob and others added 14 commits May 4, 2026 12:31
Tighten semantic review-complete matching and keep deliverable guardrail docs assertions aligned with current specification wording.

Co-authored-by: Copilot <[email protected]>
@lossyrob lossyrob added the enhancement New feature or request label May 5, 2026
Copy link
Copy Markdown
Owner Author

@lossyrob lossyrob left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PAW Review Summary

The bulk of feedback below concentrates on two SQL-backed mechanisms this PR introduces that, on close reading, weaken rather than strengthen the gates they are meant to harden:

  1. The boundary UPSERT silently preserves stale status='done' on resume/re-entry (MUST).
  2. The new prefix-filtered work-completion gate flips a previously fail-closed check to fail-open on prefix mismatch and on empty result sets (MUST).

The remainder are vocabulary/specification gaps in paw-transition, a coverage concern that the new test files assert prose rather than behavior (which is what allows the two MUST findings to slip through), and a few maintainability follow-ups.

Findings: 2 Must-address, 7 Should-address, 3 Could-consider (12 inline comments; 1 additional Could-consider was skipped per critique as speculative).

Open questions for the author

  1. For the work-completion gate: do you want the empty-set work-TODO state to fail closed (recommended) or fail open with an explicit attestation only when an out-of-band check has been made?
  2. Is "Per Review Strategy" at skills/paw-transition/SKILL.md:53 a typo for "Per Review Policy", or does Review Strategy genuinely gate pause decisions in some path I missed?
  3. For the prose-vs-behavior tests: would you prefer the behavioral conversion to land in this PR, or as a follow-up tracked against tests/integration/? The boundary-UPSERT and work-gate regression tests should land here regardless.

🐾 Review generated with PAW Review

Comment thread skills/paw-lite/SKILL.md
Comment thread skills/paw-lite/SKILL.md
Comment thread skills/paw-lite/SKILL.md
Comment thread skills/paw-transition/SKILL.md
Comment thread skills/paw-transition/SKILL.md Outdated
Comment thread tests/integration/tests/workflows/paw-lite-boundary-chain.test.ts
Comment thread tests/integration/tests/skills/workflow-context-guardrails.test.ts
Comment thread skills/paw-lite/SKILL.md
Comment thread skills/paw-init/SKILL.md
Comment thread skills/paw-transition/SKILL.md Outdated
lossyrob and others added 9 commits May 6, 2026 17:53
Reset boundary TODO status on upsert, fail closed when no work TODO or no-work attestation exists, and document/cover safe work-id validation before SQL-like TODO filtering.

Ref: #307 (comment)
Ref: #307 (comment)
Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Add missing planning-docs-review transition rows, use Review Policy terminology consistently, enumerate final-review configuration fields, and make obligation_summary the source for TODO suffixes with explicit none/default cases.

Ref: #307 (comment)
Ref: #307 (comment)
Ref: #307 (comment)
Ref: #307 (comment)
Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Move the runtime-marker denylist and config-only section checks into a shared test helper, then reuse it across guardrail, paw-init, and lifetime tests.

Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Document why standard PAW uses transition obligation_summary while PAW-Lite uses boundary briefs and SQL TODOs for the same configured-obligation contract.

Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Keep the new PAW-Lite work-gate regression fixture type-safe under the integration test TypeScript configuration.

Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Update PAW-Lite content expectations for the shared config-only wording and boundary upsert status reset contract.

Ref: #307 (comment)
Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Accept equivalent runtime wording that blocks implementation until planning docs review executes, avoiding a brittle phrase-only assertion.

Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Broaden SDK-driven assertions to accept equivalent wording for planning-docs-review gating and final-review findings resolution.

Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Accept SDK boundary wording that identifies direct implementation as the incorrect shortcut when planning docs review is enabled.

Ref: #307 (comment)

Co-authored-by: Copilot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Hardening PAW-Lite transitions and preset adherence without embedded control state

1 participant