wrhalpin · wrhalpin · Apr 19, 2026 · Apr 19, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -19,6 +19,40 @@ all v1.4+ modules.
 → Full feature breakdown is in `## [1.4.0]` below; this entry marks the version cut.
 ## [Unreleased]
 
+### Added — Analysis rule engine for hypothesis evaluation
+
+New `gnat/analysis/rules/` package implementing a declarative Hy-based
+rule engine that evaluates `analysis.Hypothesis` objects and returns
+status transition decisions (OPEN → SUPPORTED, REFUTED, INCONCLUSIVE).
+
+- `AnalysisRuleEngine.evaluate()` — priority-sorted rule evaluation
+  with phase gates, transition-slot semantics, dirty-tree refusal in
+  production, exception logging (never halts). Returns decisions
+  without mutating state.
+- `RuleOrchestrator` — bridges engine to InvestigationService with
+  audit-first pattern. Writes all firing records before applying the
+  primary decision.
+- `AuditWriter` — rule_firing_audit table (alembic 0010) with git SHA
+  capture, in-memory fallback when SQLAlchemy unavailable.
+- `EvidenceResolver` — batch-resolves STIX IDs to source_platform and
+  TRUST_LEVEL via WorkspaceStore + CLIENT_REGISTRY. Per-eval cache.
+- `RuleEnginePolicy` — 7-field config from INI `[rules]` section with
+  `GNAT_ALLOW_DIRTY_RULES=1` env override. Feature flag default OFF.
+- 26 helper functions in 6 modules: evidence (counts, ratio),
+  confidence (Admiralty Scale, band), temporal (staleness, freshness),
+  status, policy (AI-60 ceiling), source (trust levels, AI-only).
+- `defrule` Hy macro + `set-status`/`annotate`/`no-op` constructors.
+- Hy helper surface re-exporting all Python helpers with Lisp naming.
+- `RuleLoader` with directory walking, priority sort, stat-on-call
+  hot reload, graceful fallback when Hy not installed.
+- `RuleEngineProtocol` (typing.Protocol) + `create_engine()` factory.
+- `IS_AI_CONNECTOR = True` added to ChatGPT and Copilot connectors.
+- 3 production reference rules + 4 examples in `rules/` directory.
+- ADR-0054 documenting all architectural decisions.
+- 4 Diataxis docs: tutorial, how-to, spec, explanation.
+- pyproject.toml `[rules]` extras group (`hy>=1.0`).
+- 123 tests, all passing.
+
 ### Added — Cuckoo Sandbox / CAPEv2 connector
 
 New `gnat/connectors/cuckoo/` connector for dynamic malware analysis.

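The `RuleEnginePolicy` changelog entry above describes an INI `[rules]` section with a `GNAT_ALLOW_DIRTY_RULES=1` environment override and a feature flag that defaults to OFF. A minimal sketch of that precedence follows. The class `PolicySketch`, the function `load_policy`, and the option names `enabled` and `allow_dirty` are illustrative assumptions; the real policy has seven fields that this diff does not enumerate.

```python
import configparser
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicySketch:
    """Hypothetical two-field stand-in for the 7-field RuleEnginePolicy."""
    enabled: bool = False       # feature flag, OFF unless configured
    allow_dirty: bool = False   # permit rule files with uncommitted changes


def load_policy(ini_text, environ=None):
    """Build a policy from INI text, then apply the environment override."""
    environ = os.environ if environ is None else environ
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # fallback=False covers both a missing [rules] section and a missing key.
    enabled = parser.getboolean("rules", "enabled", fallback=False)
    allow_dirty = parser.getboolean("rules", "allow_dirty", fallback=False)
    # The variable named in the changelog takes precedence over the INI value.
    if environ.get("GNAT_ALLOW_DIRTY_RULES") == "1":
        allow_dirty = True
    return PolicySketch(enabled=enabled, allow_dirty=allow_dirty)
```

Passing `environ` explicitly keeps the sketch testable without touching the process environment.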
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -47,6 +47,7 @@ gnat/                        # Main Python package
 ├── codegen/                 # OpenAPI → connector scaffolding
 ├── async_client/            # Async variant (httpx)
 ├── analysis/attribution/    # Campaign tracking, Diamond Model, kill-chain, infrastructure classification, attribution hypotheses
+├── analysis/rules/          # Hy-based rule engine for hypothesis evaluation (26 helpers, audit trail, feature flag OFF)
 ├── plugins/huntgnat/        # STIX → detection rule translation (Sigma, YARA, Suricata, Snort), hunt packages, ATT&CK coverage, validation
 ├── serve/                   # FastAPI HTTP server + TAXII endpoint
 ├── search/                  # Solr full-text search sidecar (SearchMixin, indexer, ORM integration)
@@ -445,6 +446,7 @@ pip install "gnat[tui]"               # Terminal UI (textual)
 pip install "gnat[nlp]"               # NLP/IOC extraction (builtin; claude backend reuses agents)
 pip install "gnat[stix-validate]"      # Full STIX pattern validation (stix2-patterns ANTLR grammar)
 pip install "gnat[telemetry]"         # High-volume sensor ingestion (kafka-python-ng + redis)
+pip install "gnat[rules]"            # Hy-based analysis rule engine for hypothesis evaluation
 pip install "gnat[analysis]"          # Attribution & campaign tracking persistence (sqlalchemy)
 pip install "gnat[fast]"              # Rust native extension (install gnat-core wheel separately)
 pip install "gnat[all]"               # Everything

diff --git a/README.md b/README.md
@@ -67,6 +67,7 @@ GNAT provides a single, consistent abstraction layer over 158 platforms — thre
 | **Investigation Builder** | Five-step cross-platform evidence graph pipeline (seed → incident expansion → normalise → correlate → materialise) from any combination of connected platforms |
 | **HuntGNAT** | STIX pattern → detection rule translation (Sigma, YARA, Suricata, Snort), hunt packages with lifecycle management, ATT&CK coverage matrix, deployment tracking with drift detection, validation scoring |
 | **Telemetry Ingestion** | High-volume sensor ingestion from Kafka topics (honeypot, netflow, IDS alert, DNS log); Redis-backed deduplication; automatic campaign linking of ingested indicators |
+| **Rule Engine** | Hy/Lisp declarative rule engine for automated hypothesis evaluation; analyst-authorable rules built on 26 helper functions; priority-based first-match with audit trail; AI-60 confidence ceiling; feature flag OFF by default |
 | **Intelligence Reports** | Structured finished-intelligence products with five-state lifecycle (DRAFT → PUBLISHED), STIX 2.1 SDO export on publish, versioning, and attribution |
 | **Dissemination** | ExportService (STIX/JSON/PDF), webhook fan-out with HMAC signing, TAXII 2.1 server, bearer-token REST gateway |
 | **Solr Search** | Full-text search sidecar; auto-indexes on write; Grafana SimpleJSON dashboards |
@@ -365,6 +366,7 @@ pip install "gnat[tui]"                # Interactive terminal UI (textual)
 pip install "gnat[nlp]"                # NLP query engine (zero deps for builtin; Claude backend requires [agents])
 pip install "gnat[stix-validate]"      # Tier-2 STIX pattern validation (stix2-patterns / ANTLR)
 pip install "gnat[telemetry]"          # High-volume sensor ingestion (kafka-python-ng + redis)
+pip install "gnat[rules]"             # Analysis rule engine for hypothesis evaluation (hy)
 pip install "gnat[analysis]"           # Attribution & campaign tracking (sqlalchemy)
 pip install "gnat[fast]"               # Rust IOC hot-path extension (maturin wheel)
 pip install "gnat[all]"                # Core extras (yaml, taxii, ingest, async, persist, schedule, reports, viz, serve)

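Because `hy` ships only with the optional `[rules]` extra, the changelog notes that `RuleLoader` falls back gracefully when Hy is not installed. One way such a fallback could look is sketched below. The names `hy_available` and `discover_rule_files` are hypothetical and are not taken from the GNAT source; only `importlib.util.find_spec` and `pathlib` are standard-library calls.

```python
import importlib.util
import logging
from pathlib import Path

log = logging.getLogger("rules.loader.sketch")


def hy_available():
    """Return True when the optional `hy` package can be imported."""
    return importlib.util.find_spec("hy") is not None


def discover_rule_files(root, have_hy=None):
    """Return the .hy files under `root`, or [] when Hy is not installed."""
    have_hy = hy_available() if have_hy is None else have_hy
    if not have_hy:
        # Degrade to "no rules" instead of raising ImportError at startup.
        log.warning("hy is not installed; no rules will be loaded")
        return []
    return sorted(Path(root).rglob("*.hy"))
```

The `have_hy` parameter exists only so the behaviour can be exercised on machines with or without Hy.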
diff --git a/docs/explanation/rule-engine.md b/docs/explanation/rule-engine.md
@@ -0,0 +1,94 @@
+# Rule Engine — Design Explanation
+
+## Why a rule engine
+
+`InvestigationService.update_hypothesis_status` is a pure setter — it
+changes state but doesn't evaluate whether the change is warranted.
+Today, hypothesis transitions happen manually. The `reasoning.HypothesisEngine`
+has hardcoded thresholds, but it operates on `STIXHypothesis` (STIX-level
+objects), not on `analysis.Hypothesis` (workspace-level objects).
+
+The rule engine fills the gap: automated, auditable, analyst-authorable
+hypothesis evaluation at the analysis layer.
+
+## Why Hy (Lisp-on-Python)
+
+Rules need to be:
+- **Declarative** — analysts read the rule and understand what it does
+- **In-process** — no separate runtime, no marshaling across boundaries
+- **Composable** — helper predicates combine naturally
+
+Hy compiles to Python AST and runs in the same interpreter. It's more
+declarative than Python (S-expressions force a functional style) but
+less foreign than Prolog (no separate process, no unification). YAML
+was considered but rejected because expressions embedded in YAML need
+their own interpreter; a YAML engine may be added later via the Protocol.
+
+## Two engines coexist
+
+The `reasoning.HypothesisEngine` (ADR-0042) and the analysis
+`AnalysisRuleEngine` (ADR-0054) are not duplicates. They operate
+on different objects at different layers:
+
+| | HypothesisEngine | AnalysisRuleEngine |
+|---|---|---|
+| Object type | `STIXHypothesis` | `analysis.Hypothesis` |
+| Layer | STIX-level reasoning | Analyst workspace |
+| Thresholds | Hardcoded in Python | Declared in `.hy` rule files |
+| Authorship | GNAT maintainer | CTI analysts |
+
+Neither modifies the other. They will eventually feed into each other
+(STIX-level engine proposes, analysis-level engine evaluates), but
+that integration is post-v1.
+
+## Advisor pattern
+
+Rules return decisions — they never mutate state directly. The
+`RuleOrchestrator` reads the engine's `RuleEvaluationResult` and
+applies the primary decision via `InvestigationService`. This keeps
+state machine authority in one place and makes the engine testable
+with no service dependency.
+
+## Evidence resolution
+
+`Hypothesis.supporting_evidence` and `refuting_evidence` are lists of
+STIX IDs. To answer "is this evidence from a trusted source?", the
+engine uses `EvidenceResolver`, which:
+
+1. Batch-queries `WorkspaceStore.get_source_platforms_bulk()`
+2. Looks up the connector class from `CLIENT_REGISTRY`
+3. Reads `TRUST_LEVEL` from the class
+4. Caches results for the evaluation's lifetime
+
+STIX objects are never modified with connector metadata. The resolver
+is a lookup layer, not a mutation layer.
+
+## AI confidence ceiling
+
+The policy that AI-generated confidence cannot exceed 60 is enforced
+as a predicate (`within-ai-ceiling?`), not a clamp. Rules call it in
+their `:when` clause — if the ceiling is violated, the rule refuses
+to promote. The invariant is visible in rule source code, not hidden
+in a mutation pipeline.
+
+## Audit trail
+
+Every rule firing writes an audit record **before** applying the
+decision. The record captures: rule name, source file path, git SHA,
+decision JSON, and a boolean `applied` flag. If mutation fails, the
+error is recorded but the audit row already exists. This ensures
+complete traceability even for failed transitions.
+
+## Dirty-tree policy
+
+In production, rule files with uncommitted changes will not fire. The
+engine checks `git status --porcelain` for each rule's source file and
+captures `git log` SHA at firing time. This protects the audit trail:
+every fired rule can be traced to an exact committed version.
+
+`GNAT_ALLOW_DIRTY_RULES=1` overrides this for development.
+
+→ [ADR-0054: Analysis Rule Engine](architecture/adrs/0054-ADR-analysis-rule-engine.md)
+→ [Rule Engine Spec](../reference/rule-engine-spec.md)
+→ [Authoring Rules](../how-to/authoring-rules.md)
+→ [Your First Rule](../tutorials/your-first-rule.md)
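The audit-first ordering described in the "Audit trail" section of the explanation doc can be sketched in a few lines. This is an illustration under stated assumptions, not the `RuleOrchestrator` implementation: `AuditRecord`, `orchestrate`, and the `apply_decision` callback are hypothetical names, the audit log is a plain list, and the record keeps only a subset of the documented columns (the source file path and git SHA are omitted).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AuditRecord:
    rule_name: str
    decision: dict
    applied: bool = False          # flipped only after a successful mutation
    error: Optional[str] = None    # filled in when the mutation fails


def orchestrate(firings, apply_decision, audit_log):
    """Write every firing to the audit log, then apply the primary decision.

    `firings` is a list of (rule_name, decision) pairs; the first pair is
    assumed to be the primary decision.
    """
    records = [AuditRecord(name, decision) for name, decision in firings]
    audit_log.extend(records)      # audit rows exist before any mutation
    if not records:
        return None
    primary = records[0]
    try:
        apply_decision(primary.decision)
        primary.applied = True
    except Exception as exc:
        # The row was already written; the failure is recorded on it.
        primary.error = str(exc)
    return primary
```

Because the records are appended before `apply_decision` runs, a failed transition still leaves a row with `applied` set to `False` and the error text attached.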
diff --git a/docs/how-to/authoring-rules.md b/docs/how-to/authoring-rules.md
@@ -0,0 +1,122 @@
+# Authoring Rules
+
+Recipes for common hypothesis evaluation rule patterns.
+
+## Pattern 1: Evidence-threshold promotion
+
+Promote OPEN → SUPPORTED when at least three supporting items exist and nothing refutes the hypothesis:
+
+```hy
+(defrule promote-on-evidence
+  :phase "open"
+  :priority 100
+  :when (fn [h ctx]
+          (and (>= (supporting-count h) 3)
+               (not (has-refutation? h))))
+  :then (fn [h ctx]
+          (set-status "supported" :reason "3+ supporting, no refutation")))
+```
+
+## Pattern 2: Refutation
+
+Mark REFUTED when refuting evidence dominates:
+
+```hy
+(defrule refute-dominant
+  :phase "open"
+  :priority 90
+  :when (fn [h ctx]
+          (and (>= (refuting-count h) 2)
+               (> (refuting-count h) (supporting-count h))))
+  :then (fn [h ctx]
+          (set-status "refuted" :reason "Refuting evidence exceeds supporting")))
+```
+
+## Pattern 3: Blocking with no-op
+
+Prevent lower-priority rules from firing by consuming the transition slot:
+
+```hy
+(defrule block-low-confidence
+  :phase "open"
+  :priority 110
+  :when (fn [h ctx]
+          (and (has-confidence? h)
+               (not (reliability-at-least? h "C"))))
+  :then (fn [h ctx]
+          (no-op :reason "Reliability below C — blocking promotion")))
+```
+
+## Pattern 4: AI confidence ceiling
+
+Block promotion when all evidence is AI-sourced and confidence exceeds 60:
+
+```hy
+(defrule ai-ceiling-guard
+  :phase "open"
+  :priority 150
+  :when (fn [h ctx]
+          (and (ai-only? h ctx)
+               (not (within-ai-ceiling? h ctx))))
+  :then (fn [h ctx]
+          (no-op :reason "AI-only evidence exceeds confidence ceiling")))
+```
+
+## Pattern 5: Staleness timeout
+
+Mark hypotheses INCONCLUSIVE after extended inactivity:
+
+```hy
+(defrule stale-timeout
+  :phase "open"
+  :priority 20
+  :when (fn [h ctx]
+          (stale? h 90))
+  :then (fn [h ctx]
+          (set-status "inconclusive"
+                      :reason "No updates in 90+ days")))
+```
+
+## Pattern 6: Source trust gate
+
+Block promotion when no supporting evidence comes from a `trusted_internal` source:
+
+```hy
+(defrule require-trusted-source
+  :phase "open"
+  :priority 105
+  :when (fn [h ctx]
+          (and (>= (supporting-count h) 3)
+               (not (has-trusted-evidence? h ctx))))
+  :then (fn [h ctx]
+          (no-op :reason "No trusted_internal evidence — cannot promote")))
+```
+
+## Pattern 7: Annotation (non-blocking)
+
+Add metadata without affecting the transition slot:
+
+```hy
+(defrule flag-weak-evidence
+  :phase "open"
+  :priority 10
+  :when (fn [h ctx]
+          (< (supporting-count h) 2))
+  :then (fn [h ctx]
+          (annotate "needs-evidence" True
+                    :reason "Fewer than 2 supporting items")))
+```
+
+## Priority guidelines
+
+| Range | Use |
+|-------|-----|
+| 200+ | Analyst overrides |
+| 100–199 | Production promotion rules, plus guards and gates that must outrank them |
+| 50–99 | Secondary rules (refutation) |
+| 1–49 | Annotations, informational rules, timeouts |
+
+## Helper reference
+
+Run `gnat rules list-helpers` for the complete helper catalog, or see
+[Rule Engine Spec](../reference/rule-engine-spec.md).
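The how-to patterns above depend on priority ordering and on the transition slot: a blocking `no-op` at priority 110 stops a promotion rule at priority 100, while `annotate` leaves the slot free. The Python sketch below models that behaviour under stated assumptions: higher priorities run first, `set-status` and `no-op` both consume the slot, and annotations are collected regardless. `Rule`, `Decision`, `evaluate`, and the dict-based hypothesis are hypothetical stand-ins, not the `AnalysisRuleEngine` API.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

log = logging.getLogger("rules.engine.sketch")


@dataclass
class Decision:
    kind: str                      # "set-status", "no-op", or "annotate"
    reason: str = ""
    value: Optional[str] = None


@dataclass
class Rule:
    name: str
    priority: int
    when: Callable[[dict], bool]
    then: Callable[[dict], Decision]


def evaluate(rules, hypothesis):
    """Return (primary, annotations) without mutating the hypothesis."""
    primary = None
    annotations = []
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        try:
            if not rule.when(hypothesis):
                continue
            decision = rule.then(hypothesis)
        except Exception:
            log.exception("rule %s failed", rule.name)  # log and keep going
            continue
        if decision.kind == "annotate":
            annotations.append((rule.name, decision))   # slot stays free
        elif primary is None:
            primary = (rule.name, decision)             # slot consumed
    return primary, annotations


RULES = [
    Rule("promote-on-evidence", 100,
         lambda h: h["supporting"] >= 3 and h["refuting"] == 0,
         lambda h: Decision("set-status", "3+ supporting, no refutation", "supported")),
    # Admiralty reliability runs A (best) to F, so "below C" means D, E or F.
    Rule("block-low-confidence", 110,
         lambda h: h["reliability"] > "C",
         lambda h: Decision("no-op", "Reliability below C")),
    Rule("flag-weak-evidence", 10,
         lambda h: h["supporting"] < 2,
         lambda h: Decision("annotate", "Fewer than 2 supporting items", "needs-evidence")),
]
```

With reliability `"D"`, `block-low-confidence` takes the slot and the promotion rule is ignored; with reliability `"B"`, the promotion rule becomes the primary decision.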