[safe-output-health] Safe Output Health Report — 2026-05-11 #31430
Closed
This discussion has been marked as outdated by Safe Output Health Monitor. A newer discussion is available at Discussion #31638.
Executive Summary
Overall safe-output health is good: only 2 of 70 (~2.9%) processed messages produced a hard failure, and both surfaced clean diagnostic messages. Two further events were soft failures that the handlers already handle gracefully; one of those (the empty submit_pull_request_review) silently drops the action and warrants a small fix.
Safe Output Job Statistics
- add_comment
- create_issue
- create_discussion
- create_pull_request
- create_pull_request_review_comment
- submit_pull_request_review
- resolve_pull_request_review_thread
- set_issue_field
- push_to_pull_request_branch
- add_labels, remove_labels, add_reviewer, upload_artifact, upload_asset, update_pull_request, dispatch_workflow, create_code_scanning_alert, post_slack_message, set_issue_type, missing_tool, missing_data, noop
Error Clusters
Cluster 1 - resolve_pull_request_review_thread: Resource not accessible by integration
Count: 1 occurrence
Affected workflow: Smoke Claude (run §25649467832)
Sample error:
Root cause: The default GITHUB_TOKEN issued for the smoke workflow does not have permission to call the GraphQL resolveReviewThread mutation on the target PR. The mutation requires pull-requests: write, and the token's app must also be allowed to mutate that thread; the smoke workflow's token scope is missing the latter.
Impact: Smoke Claude marks the run as failed at the job level even though 13/14 messages succeeded. Real workflows are unlikely to hit this often because most don't try to resolve threads, but the smoke matrix exercises it intentionally.
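The Cluster 1 failure is recognizable by its message text, so a handler could in principle tag it as a permission problem and say so in its diagnostic. The sketch below only illustrates that idea: describeFailure, the category names, and the hint wording are hypothetical and are not taken from the safe-output handler.

```javascript
// Hypothetical helper, not part of the actual safe-output handler.
// It recognizes the permission error quoted in the Cluster 1 heading
// and attaches a hint that points at the token rather than the message.
const PERMISSION_ERROR = "Resource not accessible by integration";

function describeFailure(type, message) {
  if (message.includes(PERMISSION_ERROR)) {
    return {
      type,
      category: "permission",
      hint: `The token used for ${type} lacks the access this call needs; check the workflow's permissions.`,
    };
  }
  // Anything else is passed through unchanged.
  return { type, category: "unknown", hint: message };
}

const result = describeFailure(
  "resolve_pull_request_review_thread",
  "Resource not accessible by integration"
);
console.log(result.category); // permission
```

Matching on message text is fragile, so a real implementation would more likely key on whatever structured error data the API client exposes.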
Cluster 2 - set_issue_field: No issue fields discovered
Count: 1 occurrence
Affected workflow: Smoke Codex (run §25649467850)
Sample error:
Root cause: The repository does not have any custom Issue Field schemas exposed to the smoke workflow's token, but the agent emitted a set_issue_field request anyway. The handler correctly errors out, but the diagnostic is not actionable for the agent.
Impact: The Codex smoke matrix step that exercises set_issue_field always fails on github/gh-aw because the feature is not configured here. This is a known smoke-test coverage gap, not a regression.
Cluster 3 - create_pull_request: workflows-permission rejection (graceful fallback)
Count: 1 occurrence
Affected workflow: Q (run §25650191988)
Sample error:
Root cause: The patch produced by the Q workflow touched .github/workflows/weekly-blog-post-writer.md, but the GitHub App token doing the push does not have workflows: write. The push is rejected by GitHub's protected-paths rule.
Impact: Already handled. The safe-output handler falls back to opening a review issue ([q] fix(weekly-blog-post-writer): grep blog posts to avoid duplicate Agent of the Week #31412), so no work is lost. The final processing summary reports Failed: 0. This is the system working as designed.
Cluster 4 - submit_pull_request_review: 422 on empty review (silent drop)
Count: 1 occurrence
Affected workflow: Test Quality Sentinel (run §25650796757)
Sample error:
Root cause: The agent emitted two submit_pull_request_review messages: one with no body/event and one with body + event=APPROVE. The handler merged review state and submitted the first (empty) one. GitHub REST returns 422 when event=COMMENT is paired with empty body and no comments.
Subtle bug: After the 422, the handler logs the error and the warning, but the processing summary still reports Successful: 2 / Failed: 0 because the per-message success was set when the message was accepted, before the deferred submit. The actual PR review is not created.
Root Cause Analysis
Permission / Scope Issues (2 of 4)
- resolve_pull_request_review_thread: token scope.
- create_pull_request: workflow path push protection (handled gracefully).
Configuration Issues (1 of 4)
- set_issue_field: feature not enabled for this repo.
Validation / Handler Bugs (1 of 4)
- submit_pull_request_review: handler accepts an empty review and submits it without validation; the eventual 422 is not counted as a failure.
Recommendations
Critical Issues (Immediate Action)
None. No safe-output regression is causing widespread failures.
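Although nothing here is critical, the empty-review case from Cluster 4 is cheap to guard against. A minimal sketch of a pre-queue check follows. It assumes a message shape with optional event, body, and comments fields, and it assumes a missing event is treated as COMMENT; validateReview is a hypothetical name, not the handler's actual function.

```javascript
// Hypothetical validator; the real handler's function names and
// message shape are not shown in this report.
function validateReview(message) {
  // Assumption: a message without an event is submitted as COMMENT.
  const event = message.event ?? "COMMENT";
  const body = message.body ?? "";
  const comments = message.comments ?? [];

  // The combination that produced the 422 in Cluster 4.
  if (event === "COMMENT" && body === "" && comments.length === 0) {
    return {
      ok: false,
      error:
        "submit_pull_request_review: event=COMMENT requires a body or at least one inline comment",
    };
  }
  return { ok: true };
}

console.log(validateReview({}).ok); // false
console.log(validateReview({ event: "APPROVE", body: "LGTM" }).ok); // true
```

The check covers only the combination described in this report. Other constraints the API may enforce are deliberately left out.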
View Bug Fixes Required
- submit_pull_request_review should validate before submit and count 422 as a failure
  - [...] ✓ Message N (submit_pull_request_review) completed successfully at acceptance time, then submits in a finalization pass. A failure at finalization is logged but not reflected in the Failed: counter.
  - [...] event=COMMENT with empty body and zero inline comments before queuing; (b) when the deferred submit fails, set a non-zero failure exit and include it in the safe output(s) failed: list.
- set_issue_field should differentiate "feature disabled" from "transient failure"
View Configuration Changes
- [...] resolve_pull_request_review_thread and set_issue_field on environments where they are guaranteed to fail
View Process Improvements
- [...] submit_pull_request_review finalization failures
Work Item Plans
Work Item 1 - Fix submit_pull_request_review silent-failure accounting
- [...] submit_pull_request_review messages get accepted, then fail at finalization with HTTP 422, but the processing summary still reports Failed: 0. PR review is not created.
- [...] event + body + comments before queueing and returns a clear error for event=COMMENT with no content.
- [...] safe output(s) failed: list and reflected in the Failed: counter.
- [...] submit_pull_request_review queueing path, reject event=COMMENT when body==='' && comments.length===0. In the finalization helper, on error update the per-message status from success → failure and increment the failure counter.
Work Item 2 - Improve set_issue_field diagnostic when feature is disabled
- [...] listIssueTypes/listProjectFields call; if empty, route subsequent set_issue_field messages to a graceful-skip path.
Work Item 3 - Make smoke workflows capability-aware
- [...] resolve_pull_request_review_thread (token scope) and set_issue_field (feature off), which masks real failures in those engines.
- [...] missing_tool entry rather than failing.
- [...] Failed: 0 on a fully-green path.
Historical Context
This is the first audit captured in safe-output-health/ cache memory, so no trend comparison is available. Subsequent audits will compare against 2026-05-11.json.
Metrics and KPIs
- [...] add_comment, add_labels, create_issue, create_discussion, push_to_pull_request_branch: 100% success.
- [...] resolve_pull_request_review_thread (1/1 failed), set_issue_field (1/1 failed), submit_pull_request_review (1/3 silently failed).
- [...] create_pull_request correctly fell back to an issue when the workflows-permission rule rejected the push.
Next Steps
- [...] submit_pull_request_review failures).
- [...] set_issue_field graceful skip when feature disabled).
- [...] 2026-05-11.json to confirm the silent submit_pull_request_review failure is now counted.
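The accounting change for the silent submit_pull_request_review failure (Cluster 4) could look roughly like the following. This is a sketch under assumptions: the results array, recordFinalizationFailure, and summarize are hypothetical names, and the real handler's bookkeeping is not shown in this report. The idea is that a success recorded at acceptance time stays provisional until the deferred submit has actually run.

```javascript
// Hypothetical bookkeeping; names and shapes are illustrative only.

// Flip a provisional success to failure when the deferred submit errors.
function recordFinalizationFailure(results, index, error) {
  const message = error instanceof Error ? error.message : String(error);
  results[index] = { ...results[index], status: "failure", error: message };
}

// Derive the summary from final statuses instead of acceptance-time counters.
function summarize(results) {
  const failed = results.filter((r) => r.status === "failure");
  return {
    successful: results.length - failed.length,
    failed: failed.length,
    failedTypes: failed.map((r) => r.type),
  };
}

// A hypothetical run: two messages accepted, one deferred submit fails.
const results = [
  { type: "submit_pull_request_review", status: "success" },
  { type: "submit_pull_request_review", status: "success" },
];
recordFinalizationFailure(results, 0, new Error("HTTP 422"));

const summary = summarize(results);
console.log(`Successful: ${summary.successful} / Failed: ${summary.failed}`);
// prints "Successful: 1 / Failed: 1"
```

Computing the summary from the final per-message statuses means a late failure cannot be missed by a counter that was incremented earlier. A caller could also use summary.failed to decide the job's exit status.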