[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence by lossyrob · Pull Request #307 · lossyrob/phased-agent-workflow

lossyrob · 2026-05-05T01:11:53Z

Summary

Fixes #306.

Hardens PAW-Lite and standard PAW transition behavior so configured obligations remain visible at stage boundaries without turning generated WorkflowContext.md artifacts into mutable runtime state.

Changes

Adds config-only WorkflowContext guardrails to paw-init, including resolved-configuration summaries that agents can read without appending control state to generated artifacts.
Adds named PAW-Lite boundary checkpoints for planning-docs review, implementation, final review, and final PR handoff, with explicit configured obligations and incorrect-shortcut warnings.
Persists PAW-Lite boundary/work TODO categories in SQL so stage obligations survive summarization and final-PR admission can be enforced.
Tightens standard paw-transition output with additive obligation summaries for review policy, planning-docs review, final review, and final PR routing.
Expands integration coverage for PAW-Lite boundary chains, TODO filtering, final PR handoff, generated WorkflowContext lifetime, and standard transition lifetime behavior.
Updates user guide, reference, and specification docs to describe the hardened transition and artifact-lifecycle behavior.

Final review

Multi-model final review ran with claude-opus-4.7-xhigh and gpt-5.5 using smart resolution. Three coverage/documentation findings were auto-applied. The remaining artifact-lifecycle finding was resolved by the final commit-and-clean stop-tracking commit, so .paw/work/hardening-paw-lite-transitions/ is not present in the PR-visible diff.

Validation

npm run compile
npm run lint
bash scripts/lint-prompting.sh --all
python -m mkdocs build --strict
cd tests/integration && npm run test:integration:skills
Targeted workflow regression group passed 34/34, covering PAW-Lite boundary chains, planning-docs routing, TODO filtering, preset obligations, review policy, final PR handoff, transition obligations, paw-init WorkflowContext generation, and WorkflowContext lifetime guardrails.

Artifacts

Workflow artifacts are available at the last artifact commit: af0923f.

Breaking changes

None.

Deployment

No deployment or migration steps required.

🐾 Generated with PAW

Co-authored-by: Copilot <[email protected]>

Tighten semantic review-complete matching and keep deliverable guardrail docs assertions aligned with current specification wording. Co-authored-by: Copilot <[email protected]>

Co-authored-by: Copilot <[email protected]>

lossyrob

PAW Review Summary

The bulk of feedback below concentrates on two SQL-backed mechanisms this PR introduces that, on close reading, weaken rather than strengthen the gates they are meant to harden:

The boundary UPSERT silently preserves stale status='done' on resume/re-entry (MUST).
The new prefix-filtered work-completion gate flips a previously fail-closed check to fail-open on prefix mismatch and on empty result sets (MUST).

The remainder are vocabulary/specification gaps in paw-transition, a coverage concern that the new test files assert prose rather than behavior (which is what allows the two MUST findings to slip through), and a few maintainability follow-ups.

Findings: 2 Must-address, 7 Should-address, 3 Could-consider (12 inline comments; 1 additional Could-consider was skipped per critique as speculative).

Open questions for the author

For the work-completion gate: do you want the empty-set work-TODO state to fail closed (recommended) or fail open with an explicit attestation only when an out-of-band check has been made?
Is "Per Review Strategy" at skills/paw-transition/SKILL.md:53 a typo for "Per Review Policy", or does Review Strategy genuinely gate pause decisions in some path I missed?
For the prose-vs-behavior tests: would you prefer the behavioral conversion to land in this PR, or as a follow-up tracked against tests/integration/? The boundary-UPSERT and work-gate regression tests should land here regardless.

🐾 Review generated with PAW Review

Reset boundary TODO status on upsert, fail closed when no work TODO or no-work attestation exists, and document/cover safe work-id validation before SQL-like TODO filtering. Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Add missing planning-docs-review transition rows, use Review Policy terminology consistently, enumerate final-review configuration fields, and make obligation_summary the source for TODO suffixes with explicit none/default cases. Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Move the runtime-marker denylist and config-only section checks into a shared test helper, then reuse it across guardrail, paw-init, and lifetime tests. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Document why standard PAW uses transition obligation_summary while PAW-Lite uses boundary briefs and SQL TODOs for the same configured-obligation contract. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Keep the new PAW-Lite work-gate regression fixture type-safe under the integration test TypeScript configuration. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Update PAW-Lite content expectations for the shared config-only wording and boundary upsert status reset contract. Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Accept equivalent runtime wording that blocks implementation until planning docs review executes, avoiding a brittle phrase-only assertion. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Broaden SDK-driven assertions to accept equivalent wording for planning-docs-review gating and final-review findings resolution. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Accept SDK boundary wording that identifies direct implementation as the incorrect shortcut when planning docs review is enabled. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

lossyrob and others added 14 commits May 4, 2026 12:31

Initialize PAW workflow for Hardening PAW-Lite Transitions

753823c

Co-authored-by: Copilot <[email protected]>

Add PAW planning artifacts for issue 306

8c3f5f1

Co-authored-by: Copilot <[email protected]>

Add WorkflowContext guardrails

7c9c7b3

Co-authored-by: Copilot <[email protected]>

Add PAW-Lite boundary checkpoints

f0017bf

Co-authored-by: Copilot <[email protected]>

Harden PAW-Lite review boundaries

e8028e0

Co-authored-by: Copilot <[email protected]>

Tighten PAW transition obligations

f139bdd

Co-authored-by: Copilot <[email protected]>

Consolidate PAW transition regression coverage

79c9310

Co-authored-by: Copilot <[email protected]>

Document PAW transition hardening

7e74417

Co-authored-by: Copilot <[email protected]>

Clarify implementation review handoff docs

0255b0a

Co-authored-by: Copilot <[email protected]>

Relax PAW-Lite final PR handoff assertion

cb6dff1

Co-authored-by: Copilot <[email protected]>

Polish PAW transition validation assertions

f442041

Tighten semantic review-complete matching and keep deliverable guardrail docs assertions aligned with current specification wording. Co-authored-by: Copilot <[email protected]>

Address final review coverage gaps

9e95dac

Co-authored-by: Copilot <[email protected]>

Record final review completion

af0923f

Co-authored-by: Copilot <[email protected]>

Stop tracking PAW artifacts for hardening-paw-lite-transitions

4b57360

Co-authored-by: Copilot <[email protected]>

lossyrob added the enhancement New feature or request label May 5, 2026

lossyrob commented May 6, 2026

View reviewed changes

lossyrob and others added 9 commits May 6, 2026 17:53

Address review comment: centralize WorkflowContext invariants

250154c

Move the runtime-marker denylist and config-only section checks into a shared test helper, then reuse it across guardrail, paw-init, and lifetime tests. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Address review comment: document obligation surfaces

4ecdef8

Document why standard PAW uses transition obligation_summary while PAW-Lite uses boundary briefs and SQL TODOs for the same configured-obligation contract. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Address review comment: type PAW-Lite TODO fixtures

ab7a135

Keep the new PAW-Lite work-gate regression fixture type-safe under the integration test TypeScript configuration. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Address review comment: align PAW-Lite prompt snapshots

c7bd5aa

Update PAW-Lite content expectations for the shared config-only wording and boundary upsert status reset contract. Ref: #307 (comment) Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Address review comment: relax PAW-Lite routing assertion

200e0d8

Accept equivalent runtime wording that blocks implementation until planning docs review executes, avoiding a brittle phrase-only assertion. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Address review comment: harden PAW-Lite runtime assertions

9780fb0

Broaden SDK-driven assertions to accept equivalent wording for planning-docs-review gating and final-review findings resolution. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Address review comment: broaden planning review gate assertion

3a766b5

Accept SDK boundary wording that identifies direct implementation as the incorrect shortcut when planning docs review is enabled. Ref: #307 (comment) Co-authored-by: Copilot <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence#307

[Hardening PAW-Lite Transitions] Harden PAW-Lite transitions and preset adherence#307
lossyrob wants to merge 23 commits intomainfrom
feature/hardening-paw-lite-transitions

lossyrob commented May 5, 2026

Uh oh!

lossyrob left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

lossyrob commented May 5, 2026

Summary

Changes

Final review

Validation

Artifacts

Breaking changes

Deployment

Uh oh!

lossyrob left a comment

Choose a reason for hiding this comment

PAW Review Summary

Open questions for the author

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant