> Source of truth for which checks gate a release vs. which checks merely
> inform. The release pipeline (`.github/workflows/release.yml`) wires the
> mandatory checks; nightly workflows run the advisory ones separately.

## Principle

CI gates **regression**, not **quality**. A release pipeline must refuse to
publish anything that fails a deterministic correctness check, but it must
not block on slow, probabilistic, or network-dependent quality benches:
those drift from green to red for reasons unrelated to the artifact under
release. Quality measurements run on nightly and weekly cadences.

## Mandatory checks (release MUST fail if any of these fail)

These run inside `release.yml` after the build job and before publish. They
are wired through `scripts/validate/pre_release_check.py` (mandatory subset)
plus `scripts/release-gate.sh --quick --pypi`. Failure stops the publish
step.

| Check | Where defined | Why mandatory |
|---|---|---|
| Version 4-file consistency | `pre_release_check.py:_check_version_consistency` | A mismatch ships a broken artifact (different version in wheel vs. plugin manifest). |
| Channel experience dirs initialized | `pre_release_check.py:_check_experience_dirs` | Missing dirs cause silent runtime failures the moment a user invokes a channel. |
| MCP tools registered (10 v2 contract tools) | `pre_release_check.py:_check_mcp_tools` | A fresh install where the MCP layer is missing tools is unusable. |
| Open PR release blockers (label-gated) | `pre_release_check.py:_check_open_prs` | A release while a `release-blocker` PR is open ships a known-broken artifact. |
| Git working tree clean | `pre_release_check.py:_check_git_clean` | Releasing with uncommitted changes means the published artifact does not match any commit. |
| Local version uniqueness | `release-gate.sh --quick` (`check_version_uniqueness.py --mode=local`) | Tag collision destroys the release. |
| PyPI version uniqueness | `release-gate.sh --pypi` (`check_version_uniqueness.py --mode=pypi`) | PyPI rejects upload of an already-published version; release fails halfway. |
| Lint + format (ruff) | `release-gate.sh --quick` | Already enforced on every PR; serves as belt-and-braces here. |
| CLI surface smoke (`autosearch --help`, `mcp-check`, `doctor --json`) | `release-gate.sh --quick` | Catches packaging breakage that unit tests miss. |
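
To illustrate the first row of the table, a version-consistency check can be sketched as follows. This is a minimal sketch and not the contents of `_check_version_consistency`: the function name `find_version_mismatches`, the version strings, and the `changelog` and `package __init__` labels are assumptions made for illustration, since this document does not enumerate the four files. The sketch also assumes the version strings have already been extracted from each file.

```python
from __future__ import annotations

from collections import Counter


def find_version_mismatches(versions: dict[str, str]) -> dict[str, str]:
    """Return the entries whose version differs from the most common one.

    `versions` maps a human-readable source label to the version string
    found there. An empty result means every source agrees.
    """
    if not versions:
        return {}
    # Treat the most frequent version as the expected one.
    expected, _ = Counter(versions.values()).most_common(1)[0]
    return {name: v for name, v in versions.items() if v != expected}


# Hypothetical labels and version strings, for illustration only.
found = {
    "wheel metadata": "1.4.2",
    "plugin manifest": "1.4.1",
    "changelog": "1.4.2",
    "package __init__": "1.4.2",
}
mismatches = find_version_mismatches(found)
print(mismatches or "all versions agree")
```

A mandatory check built this way would fail the release whenever the returned mapping is non-empty, which is the wheel vs. plugin manifest mismatch the table describes.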

## Advisory checks (release continues; failures are reported but not fatal)

These appear in `pre_release_check.py` output prefixed with
`[WARN] [advisory]` and are summarized in the `ADVISORY: N/M passed` line.
They surface signal but do not change the script's exit code; the release
pipeline keeps going.

| Check | Where defined | Why advisory |
|---|---|---|
| Gate 12 bench ≥ 50% (augment-vs-bare) | `pre_release_check.py:_check_gate12_bench` | Real-LLM bench, slow + probabilistic. A green bench costs ~$5 and 15 min; running it inside `release.yml` on every patch release is wasteful. Drift between bench and HEAD is normal. Failures here flag *quality regression candidates* for human triage, not release blockers. |
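
The split between mandatory and advisory checks comes down to how each failure affects the exit code. The sketch below shows one way that could look; it is not the actual code in `pre_release_check.py`. The `run_checks` function, the `[FAIL]` prefix, and the dict-of-callables shape are assumptions, while the `[WARN] [advisory]` prefix, the `ADVISORY: N/M passed` summary, and exit code 1 for mandatory failures come from this policy.

```python
from __future__ import annotations

from typing import Callable

# A check is modeled here as a zero-argument callable returning True on pass.
Check = Callable[[], bool]


def run_checks(mandatory: dict[str, Check], advisory: dict[str, Check]) -> int:
    """Run both groups and return the process exit code."""
    exit_code = 0
    for name, check in mandatory.items():
        if not check():
            # "[FAIL]" is an assumed prefix; only mandatory failures are fatal.
            print(f"[FAIL] {name}")
            exit_code = 1

    passed = 0
    for name, check in advisory.items():
        if check():
            passed += 1
        else:
            # Advisory failures are reported but leave exit_code untouched.
            print(f"[WARN] [advisory] {name}")
    print(f"ADVISORY: {passed}/{len(advisory)} passed")
    return exit_code


# Hypothetical stand-ins for real checks.
code = run_checks(
    mandatory={"git working tree clean": lambda: True},
    advisory={"gate 12 bench": lambda: False},
)
```

With these inputs the advisory failure is printed as a warning and `code` stays 0, so the pipeline would continue to the publish step.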

## Nightly / out-of-band checks (NOT in release.yml; run on schedule)

These run in dedicated workflows. They do not block any PR or release; they
post results (or open an issue on failure) for engineering follow-up.

| Check | Workflow | Cadence | Why out of band |
|---|---|---|---|
| E2B matrix release gate | `.github/workflows/e2b-nightly.yml` | Daily 02:00 UTC | E2B sandbox runs cost ~$0.25 each and take 5-10 min of wall time. The matrix exercises real install + first-use across multiple scenarios and catches install-path regressions that only show up in a clean OS image. Daily cadence is enough because main gets at most a few merges per day. |
| Cross-platform install (Windows / macOS) | `.github/workflows/cross-platform.yml` | Weekly Monday 03:00 UTC | Runners are slow (~15 min) and the job rarely catches anything new. Weekly is enough for Tier-2 platforms. |
| Live integration tests (real APIs) | `.github/workflows/nightly.yml` | Daily 02:00 UTC | Hits external APIs (Anthropic, OpenAI, GitHub, etc.). Real spend and real rate limits mean it cannot run on every PR. |

## How to change this policy

1. Edit this file.
2. If a check moves from mandatory → advisory, move it from
   `MANDATORY_CHECKS` to `ADVISORY_CHECKS` in
   `scripts/validate/pre_release_check.py`. Mandatory failures set the exit
   code to 1; advisory failures only emit a `[WARN] [advisory]` line.
   Reverse direction: move it back into `MANDATORY_CHECKS`.
3. If a check moves into / out of `release.yml`, edit the workflow.
4. Open one PR with all three changes. Title: `policy(release): <what>`.
   Reference this doc in the PR body.

## Audit trail

| Date | Change | Driver |
|---|---|---|
| 2026-04-26 | Initial version. Gate 12 → advisory. E2B matrix → nightly. | P0-4 from `autosearch-0425-p0-scan-report.md`. The release pipeline was bypassing `pre_release_check.py` entirely; this policy spells out exactly which subset must fire. |