chore: harden codebase for AI coding agents (Tier 1 sensors) by harrymunro · Pull Request #123 · Aspegio/nelson

harrymunro · 2026-05-28T13:45:47Z

Summary

Install Ruff with AI-targeted thresholds in a new pyproject.toml so every lint message becomes self-correction feedback for an agent.
Add .pre-commit-config.yaml (Gitleaks + Ruff + standard hygiene) and a new CI job (python-lint + secret-scan) so the same checks fire locally, on commit, and in CI.
Document the sensors in CLAUDE.md and AGENTS.md — including the suppress-with-reason convention and the brownfield refactor backlog — so future agent invocations know to run them and how to interpret violations.

Based on the sensors-as-prompts and suppressions-as-review-anchors ideas from Birgitta Böckeler, Maintainability sensors for coding agents (martinfowler.com), via the harden-for-agents skill.

Note on scope

This PR also lands the merge of PR #120 (nelson_data_patterns.py learned-standing-orders pipeline, ~2,400 LOC). That feature is not part of the Tier 1 sensors work — it merged into this branch via 9113af9 before the hardening commit and rides along to main. The Tier 1 sensors changes themselves are confined to pyproject.toml, .pre-commit-config.yaml, .github/workflows/ci.yml, CLAUDE.md, AGENTS.md, docs/project_structure.md, plus the auto-fixes and reasoned noqas the new ruff config required.

Adversarial-review remediation (commits `361a506`..`82ebd74`)

A five-agent adversarial review surfaced three CRITICAL issues plus a long tail of HIGH/MEDIUM follow-ups. Applied in five CI-safe commits:

fix: restore LICENSE and README ownership — Revert LICENSE copyright to "2025 Harry Munro" (matches main); fix the stale aspegio/nelson marketplace path in README quick-start to harrymunro/nelson.
chore: tighten lint config and triage surfaced violations — Drop the bogus PLR0913 global ignore (the previous "covered by PLR0917" comment was wrong — PLR0917 is preview-only). The rule now fires; test/conftest helpers covered via per-file ignore, four production sites get reasoned per-function noqas pointing at the nelson-e6j refactor backlog. Also: anchor test glob to **/test_*.py, drop S108 from test ignore (replaced with one per-line noqa for a single placeholder), tighten the S603 rationale, delete the decorative [tool.mypy] block (roll-out tracked under nelson-8q8), and delete the test_regex*.py scratch files at the repo root (nelson-6bw).
chore: suppression hygiene + safer subprocess test pattern — Add a reason to the bare # type: ignore on hooks/nelson_hooks.py:653 (matches the documented shape on line 625). Refactor the _write_json crash/cleanup tests in test_nelson_data_memory.py to pass paths via env= injection rather than interpolating fixture values into a python -c string — the S603 suppression rationale becomes minimal and accurate.
ci: SHA-pin actions and pre-commit, add parity job, parallel tests — SHA-pin every third-party uses: (tag preserved as trailing comment); pre-commit autoupdate --freeze for the three hook repos; new pre-commit CI job using pre-commit/action@<sha> to close local↔CI parity for trailing-whitespace, end-of-file-fixer, check-{yaml,toml,json}, check-added-large-files, check-merge-conflict, ruff, and gitleaks; pin GITLEAKS_VERSION: 8.30.0 in the secret-scan job to match the pre-commit engine; cache: pip + cache-dependency-path; concurrency.cancel-in-progress scoped to PRs only; test job becomes a matrix of [skills/nelson/scripts, hooks, scripts] with fail-fast: false.
docs: soften brownfield wording, trim AGENTS.md, document sensors — Rewrite the Brownfield complexity backlog paragraph in CLAUDE.md so it no longer creates a perverse incentive to avoid flagged functions; collapse the duplicated Maintainability sensors section in AGENTS.md to a single pointer at CLAUDE.md; add pyproject.toml, .pre-commit-config.yaml, and .github/workflows/ci.yml to docs/project_structure.md; add a Local development section to CONTRIBUTING.md.

What's in the lint config

Setting	Value	Why
`max-args`	5	Agents over-stuff signatures; dataclasses/typed dicts are usually the answer
`max-branches`	10	Caps inflation of cmd_* / main() handlers
`max-statements`	50	Same
`max-complexity`	10	Mccabe gate
`max-returns`	6
`line-length`	120	Brownfield baseline; codebase comfortable at 110-120, p75 confirms
Per-dir ignores	`scripts/`, `hooks/`, `*/test_.py`, `conftest.py`	Each layer has legitimate uses of subprocess, asserts, late imports, print()

Triage outcome (brownfield)

Started at 183 violations after enabling the rules; ended at 0:

Auto-fixed (mechanical): import order, datetime UTC, unused imports.
Real bugs fixed: missing check= and strict= kwargs, unused variables, redefined loop variables, three lazy import json calls moved to module top.
Suppressed with reason on the largest functions (# noqa: C901, PLR0912, PLR0915 -- ... refactor tracked in nelson-e6j). Brownfield-safe — the rules still fire if anything gets worse, and the suppressions point at a refactor backlog rather than disappearing into noise.
Disabled globally with rationale: SIM108 (if/else→ternary often degrades readability), RUF002 (Unicode in docstrings is intentional prose).
Format: one-time ruff format baseline applied across 22 files.

What's deferred (filed as follow-up issues)

mypy strict mode (nelson-8q8) — ≈82% of non-test functions annotated (178/217 across 12 files; scripts/count-tokens.py has 0/6), but strict mode would drown out the linter signal. Roll out incrementally.
Tier 2 (nelson-7rm) — coverage gate, import-linter contracts, Semgrep SAST.
Tier 3 — mutation testing, scheduled inferential AI reviews.
Unify pytest config (nelson-y2l) — each test dir has its own conftest.py with overlapping fixture names; CI runs them as a matrix for now.
Narrow PLR0915/PLR0912 blanket ignore in test files — surface and triage ~5-10 large-test refactor candidates (new follow-up to file post-merge).
Replace gitleaks-action@v2 with the direct gitleaks binary — closes the fork-PR fail-open scenario where GITLEAKS_LICENSE is missing (new follow-up to file post-merge).

Test plan

ruff check — All checks passed
ruff format --check — 25 files already formatted
pytest skills/nelson/scripts/ — 353 passed
pytest hooks/ — 64 passed
pytest scripts/ — 21 passed
pre-commit run --all-files — all hooks pass
ruff check --select PLR0913 . — exits 0 (sensor actually fires now)
CI green on push (new python-lint, pre-commit, and matrix test jobs alongside existing secret-scan, markdown, yaml, links, spelling, references)
After merge: pre-commit install once locally so hooks fire on every commit

Install AI-targeted sensors so agent edits stay maintainable as the codebase evolves. Each violation message becomes feedback the agent reads and self-corrects against; suppress-with-reason and threshold-bump become reviewable anchors. Tier 1 sensors installed: - pyproject.toml: Ruff with AI-targeted thresholds (max-args=5, max-branches=10, max-statements=50, max-complexity=10, line-length=120 to match the codebase's actual long-tail) and rulesets E F W I B C90 PL RUF S SIM T20 UP. Per-directory ignores for scripts/, hooks/, tests/, and conftest.py reflecting how each layer legitimately uses subprocess, asserts, late imports, etc. - .pre-commit-config.yaml: Gitleaks (secret scanning), Ruff lint+format, plus standard hygiene (trailing whitespace, EOF, merge conflict, large file). Install with: pre-commit install. - .github/workflows/ci.yml: new python-lint and secret-scan jobs alongside the existing markdown/yaml/typos/tests/references checks. - CLAUDE.md / AGENTS.md: maintainability-sensors section documenting the sensors, the suppress-with-reason convention, and the brownfield complexity backlog. Brownfield triage outcome (started at 183 violations, ended at 0): - Auto-fixed imports, datetime UTC, unused imports. - Manually fixed real bugs: unused vars, missing check= and strict= kwargs, redefined loop names, lazy json imports moved to module top. - Suppressed complexity violations on the largest functions with inline noqa+reason pointing to a follow-up refactor backlog (rather than bulk-refactoring, which the source article warns produces over-engineering). - Disabled SIM108 (if/else->ternary) and RUF002 (Unicode in docstrings) globally with comments explaining why; they generate noise more often than signal in this codebase. - Applied ruff format across the repo as a one-time baseline. All 438 tests still pass; ruff lint + format clean. mypy strict mode and Tier 2/3 sensors (coverage gate, import-linter, SAST, mutation testing, periodic AI reviews) are deferred to follow-up tasks so each tier can stabilise before the next is added.

- Autoupdate to pre-commit-hooks v6.0.0, ruff-pre-commit v0.15.14, gitleaks v8.30.0 (caught on first `pre-commit run --all-files`). - end-of-file-fixer added missing trailing newlines to `.slim/cartography.json` and `demos/battleships/styles/main.css`.

Revert LICENSE copyright to "2025 Harry Munro" (matches main) and fix the stale "aspegio/nelson" marketplace path in README quick-start to "harrymunro/nelson" so every URL/badge in the doc agrees.

Drop PLR0913 from the global ignore (the previous "covered by PLR0917" note was wrong — PLR0917 is preview-only and never active). PLR0913 now fires, with two paths: - Test/conftest fixture helpers — per-file ignore (legitimate fixture surface; not a code-quality signal). - Production sites (4 functions in nelson_data_lifecycle.py and nelson_data_patterns.py) — per-function `# noqa: PLR0913 -- …` pointing at the relevant schema/CLI shape and backlogging the dataclass refactor under nelson-e6j. Other lint config corrections: - Anchor test glob to `**/test_*.py` (worked by accident before). - Drop S108 from test ignore — only one real placeholder string remains, and it's documented with a per-line noqa. - Add S603 callout that the suppression does NOT license `python -c` f-string interpolation of fixture-derived values (the C3 refactor). - Delete the `test_regex*.py` per-file ignore and the two scratch files at the repo root (covered nelson-6bw cleanup). - Drop the decorative `[tool.mypy]` block; the strict-mode roll-out lives under nelson-8q8.

Two small hardening fixes: 1. hooks/nelson_hooks.py:653 — the bare `# type: ignore[import-not-found]` inside the parens of the second `from nelson_circuit_breakers import …` block had no `--` reason. Move it onto the `from` line and merge with the existing PLC0415 noqa so it follows the same documented shape as the earlier import on line 625. 2. skills/nelson/scripts/test_nelson_data_memory.py — the `_write_json` crash/cleanup tests were building `python -c` bodies via f-string interpolation of `tmp_path`-derived values. That's exactly the pattern the S603 per-file-ignore in pyproject.toml now explicitly warns against. Switch to a single literal probe body that reads `SCRIPT_DIR` and `TARGET` from `os.environ`. The subprocess argv becomes fully literal — the S603 suppression rationale becomes minimal and accurate.

Supply-chain hardening + local↔CI parity for the CI workflow: - SHA-pin every third-party action (`actions/checkout`, `setup-python`, `gitleaks-action`, `markdownlint-cli2-action`, `action-yamllint`, `lychee-action`, `typos`) with the original tag preserved as a trailing comment for human readability and bot upgrade pathways. - `pre-commit autoupdate --freeze` does the equivalent for the three repos in `.pre-commit-config.yaml`; ruff happens to bump v0.15.14 -> v0.15.15 as a side effect (clean). - New `pre-commit` job runs `pre-commit/action@<sha>` so every hook used locally also runs in CI — closes drift for trailing-whitespace, end-of-file-fixer, check-{yaml,toml,json}, check-added-large-files, check-merge-conflict, ruff, and gitleaks. - `secret-scan` job now pins `GITLEAKS_VERSION: 8.30.0` to match the pre-commit engine. - `concurrency.cancel-in-progress` is now scoped to PRs only — pushes to main always run to completion. - `cache: pip` + `cache-dependency-path` on python jobs. - `test` job becomes a matrix of `[skills/nelson/scripts, hooks, scripts]` with `fail-fast: false` so a failure in one suite doesn't hide failures in the others.

- CLAUDE.md → *Brownfield complexity backlog*: rewrite the touch-the- function-and-refactor-or-keep-the-noqa wording so it removes the perverse incentive to avoid flagged functions. The refactor stays opportunistic; the noqa stays in place if the edit doesn't reduce complexity. - AGENTS.md → *Maintainability sensors*: collapse the duplicated bullet list to a single pointer at CLAUDE.md so the two docs can't drift apart again. - docs/project_structure.md: add `pyproject.toml`, `.pre-commit-config.yaml`, and `.github/workflows/ci.yml` to the repo-root entries — these are the files CLAUDE.md tells the agent to read. - CONTRIBUTING.md: add a *Local development* section with the pre-commit install command, the ruff invocations, and the three pytest commands, plus a link back to the sensor reference.

Pre-existing markdownlint MD031 violation on the fenced code block inside the *Tests* bullet of the Maintainability sensors section. Failed on every CI run since the Tier 1 sensors landed (27460f9) but went unnoticed because the markdown job was already red on other issues for unrelated branches. Adding a blank line above and below the fence brings markdownlint to zero errors across all 61 tracked .md files.

harrymunro added 8 commits May 28, 2026 10:45

fix: restore LICENSE and README ownership

361a506

Revert LICENSE copyright to "2025 Harry Munro" (matches main) and fix the stale "aspegio/nelson" marketplace path in README quick-start to "harrymunro/nelson" so every URL/badge in the doc agrees.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore: harden codebase for AI coding agents (Tier 1 sensors)#123

chore: harden codebase for AI coding agents (Tier 1 sensors)#123
harrymunro wants to merge 8 commits into
mainfrom
chore/harden-for-agents

harrymunro commented May 28, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

harrymunro commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Note on scope

Adversarial-review remediation (commits 361a506..82ebd74)

What's in the lint config

Triage outcome (brownfield)

What's deferred (filed as follow-up issues)

Test plan

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

harrymunro commented May 28, 2026 •

edited

Loading

Adversarial-review remediation (commits `361a506`..`82ebd74`)