[audit-workflows] Daily audit — 2026-05-10 #31382
Replies: 2 comments
💥 WHOOSH! 🦸♀️ The smoke test agent ZOOMS through the audit discussion! ⚡
Run §25641551150: engine nominal, signing off! 🚀💨
Warning: Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"
See Network Configuration for more information.
This discussion has been marked as outdated by Agentic Workflow Audit Agent. A newer discussion is available at Discussion #31591.
Overview
Across the last 24 hours, 215 workflow runs were observed for `github/gh-aw`, with 2 activation failures (0.93% failure rate) and $27.23 in estimated agent cost. All failures occurred before 14:00 UTC; the most recent 17:00–21:00 UTC window was 100% green. Two recurring concerns deserve attention: an outdated lock file for `schema-consistency-checker.md`, and a 17% firewall block rate on the `Daily Project Performance Summary Generator` workflow.
Summary
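The two headline percentages can be re-derived from the raw counts quoted in this report. The following is a minimal Python sketch; the helper name `rate_pct` is illustrative and not part of any gh-aw tooling.

```python
def rate_pct(part: int, total: int) -> float:
    """Return part/total as a percentage; 0.0 when total is zero."""
    return 100.0 * part / total if total else 0.0


# Counts quoted in this audit.
failure_rate = rate_pct(2, 215)  # activation failures / observed runs
block_rate = rate_pct(16, 93)    # blocked / total requests, Daily Project Performance Summary Generator

print(f"failure rate: {failure_rate:.2f}%")
print(f"firewall block rate: {block_rate:.0f}%")
```

Keeping the raw counts next to the percentages makes it easy to spot rounding differences between audits.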
Critical Issues
1. Schema Consistency Checker: outdated lock file ⚠️
Run §25621677191 failed at activation with:
Action: run `gh aw compile` and commit the regenerated `schema-consistency-checker.lock.yml`. Until that lands, every scheduled run of this workflow will skip the agent stage.
2. Auto-Triage Issues: transient DNS failure 🌐
Run §25630347638 failed at activation:
Isolated occurrence: other Auto-Triage Issues runs in the window completed successfully. No code change recommended; if the failure recurs more often, consider adding a retry to the activation `git fetch`.
High-Severity Observability Signals
Network friction hotspot — Daily Project Performance Summary Generator
This workflow had 16 blocked requests out of 93 (17%), the highest block rate of any workflow in the window. Blocked domains:
- `accounts.google.com:443` (4)
- `redirector.gvt1.com:443` (4)
- `clients2.google.com` (3)
- `www.google.com:443` (3)
- `release-assets.githubusercontent.com:443` (1)
- `nodejs.org:443` (1)
The `nodejs.org` and `release-assets.githubusercontent.com` blocks suggest a toolchain bootstrap step. The Google-domain blocks look like unintended Chrome/Puppeteer egress. Review whether these are required (and if so, allowlist them) or trim the script that triggers them.
Daily Cache Strategy Analyzer: direct GitHub egress blocked
2 blocks on `api.github.com:443` / `github.com:443`. Workflows should route GitHub reads through the GitHub MCP server instead of making direct API calls.
Execution drift: Test Quality Sentinel
Varied from 4 to 28 turns across runs (avg 16). Suggests unstable prompt shape or fluctuating task scope; worth a prompt review to tighten the agent loop.
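One way to turn "execution drift" into a repeatable check is to compute the spread of turn counts per workflow. The sketch below is an assumption-laden illustration: the per-run values in `sample` are made up so that they match the reported min (4), max (28), and average (16), and the 0.3 threshold on the coefficient of variation is an arbitrary starting point, not a gh-aw default.

```python
from statistics import mean, pstdev


def turn_drift(turns: list[int], max_cv: float = 0.3) -> dict:
    """Summarize turn counts; flag drift when stddev/mean exceeds max_cv."""
    avg = mean(turns)
    cv = pstdev(turns) / avg if avg else 0.0
    return {
        "min": min(turns),
        "max": max(turns),
        "avg": avg,
        "cv": round(cv, 2),
        "drifting": cv > max_cv,
    }


# Hypothetical per-run turn counts (not taken from the real logs).
sample = [4, 12, 16, 20, 28]
report = turn_drift(sample)
print(report)
```

A relative measure such as the coefficient of variation is used here so that the same threshold can be applied to workflows with very different typical turn counts.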
Trend Charts
Workflow health
All failures are clustered before 14:00 UTC, in the 06:00 and 13:00 hours. Once scheduled daily workflows started firing at 17:00 UTC, the success rate held at 100% through the rest of the window. The dip in success rate reflects those two early failures, not a systemic regression.
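The hourly view described above can be approximated by bucketing runs by their UTC start hour. This is a sketch only: the `(timestamp, conclusion)` tuple shape and the entries in `sample_runs` are hypothetical stand-ins, not the 215 real runs from this audit.

```python
from collections import defaultdict
from datetime import datetime, timezone


def hourly_success(runs):
    """Map UTC hour -> success rate (0.0 to 1.0) from (ISO timestamp, conclusion) pairs."""
    buckets = defaultdict(lambda: [0, 0])  # hour -> [successes, total]
    for started_at, conclusion in runs:
        hour = datetime.fromisoformat(started_at).astimezone(timezone.utc).hour
        buckets[hour][1] += 1
        if conclusion == "success":
            buckets[hour][0] += 1
    return {hour: ok / total for hour, (ok, total) in sorted(buckets.items())}


# Hypothetical sample data, for illustration only.
sample_runs = [
    ("2026-05-10T06:05:00+00:00", "failure"),
    ("2026-05-10T06:40:00+00:00", "success"),
    ("2026-05-10T13:10:00+00:00", "failure"),
    ("2026-05-10T17:02:00+00:00", "success"),
    ("2026-05-10T17:30:00+00:00", "success"),
]
rates = hourly_success(sample_runs)
print(rates)
```

Hours with a rate below 1.0 are the ones worth inspecting first; in this made-up sample they are 06:00 and 13:00.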
Token usage and cost
Cost rises sharply through the 17:00–21:00 UTC daily-job window. The 19:00 UTC peak ($8.18) is driven entirely by `[aw] Failure Investigator (6h)`, and the 21:00 UTC bar (~$7.62) by `Daily Safe Output Tool Optimizer`. Just two workflows account for 58% of the day's spend.
Top cost drivers (24h)
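The 58% share can be checked with simple arithmetic. The sketch below assumes that the 19:00 and 21:00 UTC hourly figures represent each workflow's full cost for the day, and it treats the approximate ~$7.62 value as exact.

```python
total_cost_usd = 27.23  # estimated agent cost for the 24h window

# Assumption: each hourly peak equals the workflow's whole-day cost.
top_two_usd = {
    "[aw] Failure Investigator (6h)": 8.18,
    "Daily Safe Output Tool Optimizer": 7.62,  # reported as approximate
}

share_pct = 100.0 * sum(top_two_usd.values()) / total_cost_usd
print(f"top two workflows: {share_pct:.0f}% of spend")
```

If either workflow also ran in other hours, the true share would be higher than this estimate.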
Note: Copilot-engine workflows report `$0.00` because cost is not surfaced for them in the run summary; raw token usage is the better health signal for those.
Per-run overview (all 30 runs in the audit's most recent batch)
DIFC integrity-filtered issue reads (5 events)
The agent attempted to read issues from non-approved authors, which were correctly blocked by integrity filtering. Behaves as designed — no action required.
- #19099 by abillingsley
- #31241 by trask
- #30832 by jbaruch
- #30840 by rhardouin
- #28888 by clementbolin
Reason in every case: Resource has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
Recommendations
- Regenerate `schema-consistency-checker.lock.yml` with `gh aw compile`. This is the highest-priority concrete fix: until it lands, every scheduled run will continue to fail at activation.
- Review `Daily Project Performance Summary Generator` egress. Decide whether the Google-domain and Node.js fetches are intentional (allowlist them) or accidental (remove the upstream call). Its 17% block rate is the worst in the fleet.
- Migrate `Daily Cache Strategy Analyzer` to the GitHub MCP server for any direct API/web access. Direct calls to `api.github.com` / `github.com` are being blocked, and the MCP path is the supported one.
- Review the `Test Quality Sentinel` prompt. A turn count fluctuating from 4 to 28 is a stability signal worth investigating.
- Confirm that `[aw] Failure Investigator (6h)` and `Daily Safe Output Tool Optimizer` justify their cost. Together they consume 58% of the day's Claude spend. Either trim their context window or move them to lower-cost models if appropriate.
Notes on data coverage
Historical pre-2026-05-10 logs were not retrievable from the MCP `logs` tool (pagination returned empty beyond the day's window), so the trend chart is intra-day hourly rather than 30-day. Repo memory has now been seeded with today's metrics; future audits should be able to plot real day-over-day trends.
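How the repo memory actually stores metrics is not described in this report. As one possible shape, the sketch below appends a daily record to a JSON Lines file and reads the history back; the file name `audit-metrics.jsonl`, the field names, and both helper functions are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def append_daily_metrics(path: Path, record: dict) -> None:
    """Append one day's metrics as a single JSON line (creates the file if missing)."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


def load_history(path: Path) -> list[dict]:
    """Read all stored daily records in file order."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


# Hypothetical location; a temp directory keeps the example self-contained.
history_file = Path(tempfile.mkdtemp()) / "audit-metrics.jsonl"
append_daily_metrics(
    history_file,
    {"date": "2026-05-10", "runs": 215, "failures": 2, "cost_usd": 27.23},
)
history = load_history(history_file)
print(history)
```

An append-only, one-record-per-day layout keeps each audit's write small and lets a later audit plot a trend by reading the file top to bottom.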