
[skill-optimizer] Daily Skill Optimizer Improvements - 2026-05-12 #31624

@github-actions

Summary

  • Run mode: dry-run
  • Status: ⚠️ Skipped (no OPENROUTER_API_KEY configured — suite execution did not run)

Key Findings

  1. No benchmark suite was executed — the optimizer cannot generate meaningful pass-rate scores or improvement candidates without an API key. The dry-run path only validates tooling setup; all skill quality signals are absent.

    • Expected impact: Enabling benchmark mode would surface concrete pass-rate gaps per skill and drive data-driven improvements.
  2. SKILL.md surface description is too sparse for task generation: SKILL.md (the benchmark target) is only ~40 lines and lacks representative edge-case examples. The skill-optimizer's taskGeneration (maxTasks: 20) relies on richly described surfaces; a thin skill file produces low-diversity, low-quality eval tasks.

    • Expected impact: Expanding SKILL.md with more concrete usage patterns and failure modes would increase eval diversity and improve benchmark reliability.
  3. allowedPaths is locked to ["SKILL.md"] — the optimizer's optimize.allowedPaths only permits editing SKILL.md. However, the majority of detailed guidance lives in skills/*/SKILL.md domain files. The optimizer can never improve those skill files in automated passes, limiting optimization scope to the thin top-level surface.

    • Expected impact: Widening allowedPaths (e.g., ["SKILL.md", "skills/*/SKILL.md"]) would let the optimizer iteratively improve the domain skill files that agents actually read most often.
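To illustrate finding 3, here is a minimal sketch of which files become editable under the current and the widened allowedPaths. It assumes entries are matched as shell-style globs against repo-relative paths; the optimizer's real matching rules are not shown in the artifact and may differ. The helper name `is_editable` is hypothetical.

```python
from fnmatch import fnmatchcase

# Assumption: allowedPaths entries behave like shell-style globs matched
# against repo-relative paths. The real optimizer may use other rules.
# Note: fnmatch's "*" also matches "/", so it is more permissive than some
# glob implementations.
CURRENT_ALLOWED = ["SKILL.md"]
WIDENED_ALLOWED = ["SKILL.md", "skills/*/SKILL.md"]


def is_editable(path: str, allowed_paths: list[str]) -> bool:
    """Return True if path matches at least one allowed pattern."""
    return any(fnmatchcase(path, pattern) for pattern in allowed_paths)


candidates = [
    "SKILL.md",
    "skills/github-mcp-server/SKILL.md",
    "skills/developer/SKILL.md",
]

for path in candidates:
    print(
        path,
        "current:", is_editable(path, CURRENT_ALLOWED),
        "widened:", is_editable(path, WIDENED_ALLOWED),
    )
```

Under this assumption, only the top-level SKILL.md is editable with the current setting, while the widened list also covers the domain skill files.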
Evidence from Artifact

summary.json

{
  "repository": "github/gh-aw",
  "run_mode": "dry-run",
  "run_status": 0,
  "run_url": "https://github.com/github/gh-aw/actions/runs/25712309353"
}

run.log

dry-run: Docker available but OPENROUTER_API_KEY not set; skipping suite execution
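The log line above suggests the workflow falls back to dry-run when the key is missing. A minimal sketch of that gating, assuming the decision depends only on Docker availability and a non-empty OPENROUTER_API_KEY (the actual workflow logic is not included in the artifact, and `select_run_mode` is a hypothetical name):

```python
import os


def select_run_mode(env: dict[str, str], docker_available: bool) -> str:
    """Pick 'benchmark' only when Docker and a non-empty API key are present.

    Assumption: these two conditions are the only gate. The real workflow
    may check more than this.
    """
    has_key = bool(env.get("OPENROUTER_API_KEY", "").strip())
    if docker_available and has_key:
        return "benchmark"
    return "dry-run"


# With no key in the environment this prints "dry-run", matching the log.
print(select_run_mode(dict(os.environ), docker_available=True))
```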

.skill-optimizer/skill-optimizer.json (relevant excerpt)

{
  "target": { "skill": "../SKILL.md" },
  "benchmark": {
    "taskGeneration": { "enabled": true, "maxTasks": 20 },
    "verdict": { "perModelFloor": 0.6, "targetWeightedAverage": 0.8 }
  },
  "optimize": {
    "allowedPaths": ["SKILL.md"],
    "maxIterations": 3
  }
}
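For context on the verdict block, the sketch below shows one plausible reading of the two thresholds: every model must reach perModelFloor, and the weighted mean across models must reach targetWeightedAverage. This interpretation, the function name `passes_verdict`, and the model names and numbers are all assumptions for illustration; no benchmark results exist for this run.

```python
def passes_verdict(
    pass_rates: dict[str, float],
    weights: dict[str, float],
    per_model_floor: float = 0.6,
    target_weighted_average: float = 0.8,
) -> bool:
    """Return True if every model clears the floor and the weighted mean
    of pass rates reaches the target (assumed semantics)."""
    if not pass_rates:
        return False  # nothing was benchmarked, so no passing verdict
    if any(rate < per_model_floor for rate in pass_rates.values()):
        return False
    total_weight = sum(weights[model] for model in pass_rates)
    weighted_sum = sum(rate * weights[model] for model, rate in pass_rates.items())
    return weighted_sum / total_weight >= target_weighted_average


# Hypothetical inputs, not data from any real run.
example_rates = {"model-a": 0.9, "model-b": 0.7}
example_weights = {"model-a": 2.0, "model-b": 1.0}
print(passes_verdict(example_rates, example_weights))
```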

SKILL.md is ~40 lines with high-level usage examples but no edge-case, failure-mode, or detailed workflow frontmatter patterns.

Recommendations

  1. Add OPENROUTER_API_KEY to repository Actions secrets so the daily workflow runs in benchmark mode instead of dry-run. This is the prerequisite for all other optimizer improvements; without it the tool produces no actionable data.

  2. Expand SKILL.md with richer content: add 5-10 representative frontmatter snippets (engines, MCP tool configs, safe-outputs, network restrictions), a short troubleshooting / common-error section, and at least one multi-step usage walkthrough. This gives the task-generation component material to produce diverse, realistic eval cases.

  3. Widen optimize.allowedPaths in .skill-optimizer/skill-optimizer.json from ["SKILL.md"] to ["SKILL.md", "skills/*/SKILL.md"] (or add individual high-traffic skill files such as skills/github-mcp-server/SKILL.md and skills/developer/SKILL.md). This lets the optimizer improve the domain-specific guidance that developers and agents rely on most.
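Recommendation 3 could be applied with a small config patch. The sketch below works on the excerpt shown under Evidence and appends the glob pattern without touching other keys; `widen_allowed_paths` is a hypothetical helper, and whether the optimizer accepts glob patterns in allowedPaths is an assumption that should be checked before committing the change.

```python
import copy
import json


def widen_allowed_paths(config: dict, extra_patterns: list[str]) -> dict:
    """Return a copy of config with extra patterns appended to
    optimize.allowedPaths, skipping patterns that are already present."""
    updated = copy.deepcopy(config)
    allowed = updated.setdefault("optimize", {}).setdefault("allowedPaths", [])
    for pattern in extra_patterns:
        if pattern not in allowed:
            allowed.append(pattern)
    return updated


# The excerpt from .skill-optimizer/skill-optimizer.json quoted in this issue.
config = json.loads("""
{
  "target": { "skill": "../SKILL.md" },
  "benchmark": {
    "taskGeneration": { "enabled": true, "maxTasks": 20 },
    "verdict": { "perModelFloor": 0.6, "targetWeightedAverage": 0.8 }
  },
  "optimize": {
    "allowedPaths": ["SKILL.md"],
    "maxIterations": 3
  }
}
""")

widened = widen_allowed_paths(config, ["skills/*/SKILL.md"])
print(json.dumps(widened["optimize"], indent=2))
```

The helper returns a copy rather than mutating its input, so the original config can still be compared against the patched one in review.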

Generated by Daily Skill Optimizer Improvements · ● 4M ·

  • expires on May 19, 2026, 3:58 AM UTC
