Commit 3368afa

Merge pull request #129 from wrhalpin/claude/create-gnat-admin-guide-BOSrp
Add analysis rule engine foundation (Phase 1-2: ADR, resolver, decisi…
2 parents 810767a + 59aa66b commit 3368afa

30 files changed

Lines changed: 1410 additions & 0 deletions
Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
# ADR-0054: Analysis Rule Engine

**Decision:** Implement a declarative rule engine at `gnat/analysis/rules/`
that evaluates `analysis.investigations.Hypothesis` objects and returns
status transition decisions. Rules are authored as `.hy` (Hy/Lisp) files,
loaded dynamically, and evaluated on hypothesis mutation. The engine is an
advisor: it returns decisions but never mutates state directly.

**Problem statement:**
`InvestigationService.update_hypothesis_status` is a pure setter with no
evaluation logic. Status transitions happen manually. The `reasoning.HypothesisEngine`
has hardcoded thresholds at the STIX level but operates on `STIXHypothesis`,
not `analysis.Hypothesis`. There is an empty slot at the analysis layer for
automated, auditable, analyst-authorable evaluation logic.

## Why Hy

Hy is a Lisp that compiles to Python AST and runs in the same interpreter.
It sits between "more declarative than Python" and "less foreign than Prolog,"
embedded in-process with no new service boundary.

**Alternatives considered:**
- **Prolog:** Strong for pure inference but requires a separate runtime.
  Marshaling STIX objects across the boundary breaks the
  Postgres-as-source-of-truth contract.
- **Clojure via Babashka:** Same cross-boundary cost as Prolog.
- **YAML + DSL:** Analyst-familiar but YAML-with-expressions becomes
  its own interpreter. May be added as a second engine post-v1.
- **Pure Python functions:** Works but loses the declarative-authoring
  property that is the engine's main value.

## Key Decisions

### Rules are advisors, not mutators

The engine's `evaluate()` returns a `RuleEvaluationResult` containing
decisions. It does not mutate state. An orchestrator reads the decision
and applies it via `InvestigationService.update_hypothesis_status`. This
keeps the state machine authority in one place and makes the engine
testable in isolation.
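
The split between advising and mutating can be sketched as follows. This is a minimal illustration, not the project's orchestrator: the shape of `RuleEvaluationResult`, the `FakeInvestigationService` stand-in, the `apply_result` helper, and the `"supported"` status value are all assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Action(str, Enum):
    SET_STATUS = "set_status"
    ANNOTATE = "annotate"
    NO_OP = "no_op"


@dataclass(frozen=True)
class Decision:
    action: Action
    reason: str
    target_status: str | None = None


@dataclass(frozen=True)
class RuleEvaluationResult:
    # Assumed shape: the ADR only says the result "contains decisions".
    decisions: tuple[Decision, ...]


class FakeInvestigationService:
    """Stand-in for InvestigationService; records calls instead of persisting."""

    def __init__(self) -> None:
        self.calls: list[tuple[int, str]] = []

    def update_hypothesis_status(self, hypothesis_id: int, status: str) -> None:
        self.calls.append((hypothesis_id, status))


def apply_result(service, hypothesis_id: int, result: RuleEvaluationResult) -> bool:
    """Apply the first status-changing decision, if any; return True if applied."""
    for decision in result.decisions:
        if decision.action is Action.SET_STATUS and decision.target_status:
            # The only mutation happens here, through the service.
            service.update_hypothesis_status(hypothesis_id, decision.target_status)
            return True
    return False
```

Because the engine side only builds `Decision` values, it can be unit-tested without a database, while the service remains the single place where status changes.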

### Two-engine coexistence

`reasoning.HypothesisEngine` (STIX-level, ADR-0042) remains untouched.
The new `AnalysisRuleEngine` operates on `analysis.investigations.Hypothesis`
(analyst workspace level). These are different views of the same concept
at different layers. They do not merge.

### Evidence resolution via dedicated resolver

`Hypothesis.supporting_evidence` and `refuting_evidence` are lists of
STIX IDs. The engine resolves each ID to its originating connector via
`EvidenceResolver`, which queries `WorkspaceStore.get_source_platforms_bulk`
and looks up `TRUST_LEVEL` from `CLIENT_REGISTRY`. STIX objects are not
polluted with connector metadata.
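
A rough sketch of that lookup chain is shown below. The signature of `get_source_platforms_bulk`, the registry contents, the trust-level strings, and the `trust_levels` method name are assumptions for illustration; only the names `EvidenceResolver`, `WorkspaceStore`, `CLIENT_REGISTRY`, and `TRUST_LEVEL` come from the ADR.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectorInfo:
    TRUST_LEVEL: str


# Hypothetical stand-in for CLIENT_REGISTRY (connector name -> connector metadata).
CLIENT_REGISTRY = {
    "vendor_feed": ConnectorInfo(TRUST_LEVEL="high"),
    "llm_enricher": ConnectorInfo(TRUST_LEVEL="ai"),
}


class FakeWorkspaceStore:
    """Stand-in for WorkspaceStore; maps STIX IDs to source platform names."""

    def __init__(self, platforms: dict[str, str]) -> None:
        self._platforms = platforms

    def get_source_platforms_bulk(
        self, workspace_id: int, stix_ids: list[str]
    ) -> dict[str, str]:
        # Assumed signature: one bulk query instead of one query per ID.
        return {sid: self._platforms[sid] for sid in stix_ids if sid in self._platforms}


class EvidenceResolver:
    def __init__(self, store, registry) -> None:
        self._store = store
        self._registry = registry

    def trust_levels(self, workspace_id: int, stix_ids: list[str]) -> dict[str, str | None]:
        """Resolve each STIX ID to its connector's trust level (None if unknown)."""
        platforms = self._store.get_source_platforms_bulk(workspace_id, stix_ids)
        resolved: dict[str, str | None] = {}
        for sid in stix_ids:
            connector = self._registry.get(platforms.get(sid))
            resolved[sid] = connector.TRUST_LEVEL if connector else None
        return resolved
```

The STIX objects themselves are never touched: the connector metadata lives only in the resolver's return value.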

### Audit-first with applied flag

Every rule evaluation writes an audit record BEFORE applying the decision.
The record has `applied: bool` that flips to true after successful mutation.
No transaction threading: sequential operations with audit as the leading write.
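
The ordering can be illustrated with a small sketch. The `AuditRecord` fields, the in-memory `audit_log` list, and the `evaluate_and_apply` helper are hypothetical; the point is only the sequence: write the record, attempt the mutation, then flip `applied`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class AuditRecord:
    hypothesis_id: int
    rule_name: str
    target_status: str
    applied: bool = False  # flips to True only after a successful mutation


def evaluate_and_apply(audit_log, apply_fn, hypothesis_id, rule_name, target_status):
    """Write the audit record first, then apply, then mark the record applied."""
    record = AuditRecord(hypothesis_id, rule_name, target_status)
    audit_log.append(record)                 # leading write
    apply_fn(hypothesis_id, target_status)   # may raise; record stays applied=False
    record.applied = True
    return record
```

If the mutation raises, the audit log still holds a record with `applied=False`, so a failed attempt remains visible without any shared transaction.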

### AI-60 confidence ceiling as predicate, not clamp

The AI confidence ceiling is enforced as a helper predicate
`within-ai-ceiling?` that rules call in their `:when` clause. Rules
refuse to promote if the ceiling is violated. The ceiling is NOT a
mutation that clamps the number; it stays visible in rule source code.
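
To make the predicate-versus-clamp distinction concrete, here is a Python rendering of what a rule's `:when` clause might check. The `promote_when` function, its threshold of 50, and the flat `confidence` / `ai_only` attributes are assumptions for the example; the default ceiling of 60 matches the helper module in this commit.

```python
from types import SimpleNamespace

AI_CEILING = 60  # default used by the ai-ceiling helper; configurable via policy


def within_ai_ceiling(confidence: int, ai_only: bool, ceiling: int = AI_CEILING) -> bool:
    """True unless the hypothesis is AI-only and its confidence exceeds the ceiling."""
    return (not ai_only) or confidence <= ceiling


def promote_when(h) -> bool:
    """Hypothetical :when clause: enough confidence AND within the AI ceiling."""
    return h.confidence >= 50 and within_ai_ceiling(h.confidence, h.ai_only)


ai_high = SimpleNamespace(confidence=85, ai_only=True)
human_high = SimpleNamespace(confidence=85, ai_only=False)

print(promote_when(ai_high))     # False: the rule refuses to promote
print(ai_high.confidence)        # 85: the stored value is not clamped
print(promote_when(human_high))  # True
```

A clamp would silently rewrite 85 to 60; the predicate leaves the number alone and makes the refusal explicit in the rule.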

### Priority-based first-match semantics

Rules are sorted by priority, descending. The first rule whose `:when` returns
truthy for a status-transition decision fires and consumes the transition
slot. Annotations always fire. `no_op` consumes the slot without mutating.
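
The selection semantics can be sketched in a few lines. The `Rule` dataclass, the dictionary-shaped hypothesis, and the `evaluate` function here are simplified stand-ins, not the engine's actual types.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    when: Callable[[dict], bool]
    action: str  # "set_status", "annotate", or "no_op"


def evaluate(rules: list[Rule], hypothesis: dict) -> list[str]:
    """Return the names of rules that fire, highest priority first."""
    fired: list[str] = []
    slot_taken = False
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if not rule.when(hypothesis):
            continue
        if rule.action == "annotate":
            fired.append(rule.name)   # annotations always fire
        elif not slot_taken:
            fired.append(rule.name)   # first set_status or no_op takes the slot
            slot_taken = True
    return fired


rules = [
    Rule("tag", 10, lambda h: True, "annotate"),
    Rule("promote", 50, lambda h: h["supporting"] >= 3, "set_status"),
    Rule("hold", 100, lambda h: h["refuting"] > 0, "no_op"),
]

print(evaluate(rules, {"supporting": 3, "refuting": 1}))  # ['hold', 'tag']
print(evaluate(rules, {"supporting": 3, "refuting": 0}))  # ['promote', 'tag']
```

In the first call the higher-priority `no_op` rule takes the transition slot, so `promote` is suppressed even though its condition holds.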

### Dirty-tree policy

In production, rules with uncommitted source file changes will not fire.
The Git SHA is captured in audit records. `GNAT_ALLOW_DIRTY_RULES=1` provides
an emergency override.
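
One way such a gate could look is sketched below. How the engine detects "production" and "dirty" is not stated in this ADR, so the `production` flag, the function names, and the use of `git status --porcelain` are assumptions; only the `GNAT_ALLOW_DIRTY_RULES` variable comes from the text.

```python
from __future__ import annotations

import subprocess


def source_is_dirty(rule_path: str) -> bool:
    """True if git reports uncommitted changes for the given rule file."""
    out = subprocess.run(
        ["git", "status", "--porcelain", "--", rule_path],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return bool(out.strip())


def rule_may_fire(is_dirty: bool, production: bool, env) -> bool:
    """Block dirty rules in production unless the override is set to "1"."""
    if not production or not is_dirty:
        return True
    return env.get("GNAT_ALLOW_DIRTY_RULES") == "1"
```

A caller would typically pass `os.environ` as `env`; keeping it a parameter makes the policy easy to test without touching the real environment.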

### Feature flag default OFF

Existing users are unaffected. Enable via `[rules] enabled = true` in config.

## Consequences

**Positive:** Analyst-authorable hypothesis evaluation, full audit trail,
declarative expression, testable in isolation from service layer.

**Negative:** Hy dependency (optional extra), helper library maintenance,
analyst learning curve for Lisp syntax.

**Neutral:** Second engine implementation (YAML, Python) possible later
via `RuleEngineProtocol` without refactoring the core.

→ Related: ADR-0031 (Analysis Layer Architecture)
→ Related: ADR-0033 (Confidence Scoring: Admiralty Scale)
→ Related: ADR-0042 (Hypothesis Engine: STIX-level, coexists)

docs/explanation/architecture/adrs/README.md

Lines changed: 1 addition & 0 deletions
@@ -66,6 +66,7 @@ subsystems.
 51. [ADR-0051: Attribution & Campaign Tracking](0051-ADR-attribution-campaign-tracking.md)
 52. [ADR-0052: Telemetry Ingestion](0052-ADR-telemetry-ingestion.md)
 53. [ADR-0053: Infrastructure Graph Labels](0053-ADR-infrastructure-graph-labels.md)
+54. [ADR-0054: Analysis Rule Engine](0054-ADR-analysis-rule-engine.md)

 ---

gnat/analysis/rules/__init__.py

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""
gnat.analysis.rules
=======================

Declarative rule engine for hypothesis evaluation. Rules are authored
as ``.hy`` files, loaded dynamically, and return status transition
decisions without mutating state directly.

Install Hy dependency with ``pip install "gnat[rules]"``.
"""

gnat/analysis/rules/context.py

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""RuleContext: evaluation-scoped state passed to rule predicates."""

from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime

from gnat.analysis.rules.policy import RuleEnginePolicy
from gnat.analysis.rules.resolver import EvidenceResolver


@dataclass(frozen=True)
class RuleContext:
    resolver: EvidenceResolver
    policy: RuleEnginePolicy
    now: datetime
    workspace_id: int
    engine_version: str = "1.0.0"

gnat/analysis/rules/decisions.py

Lines changed: 76 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,76 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""Decision dataclasses returned by rules."""

from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Any

from gnat.analysis.investigations.models import HypothesisStatus


class DecisionAction(str, Enum):
    SET_STATUS = "set_status"
    ANNOTATE = "annotate"
    NO_OP = "no_op"


@dataclass(frozen=True)
class Decision:
    action: DecisionAction
    reason: str
    timestamp: datetime

    def should_mutate(self) -> bool:
        return self.action == DecisionAction.SET_STATUS

    def consumes_transition_slot(self) -> bool:
        return self.action in (DecisionAction.SET_STATUS, DecisionAction.NO_OP)


@dataclass(frozen=True)
class SetStatusDecision(Decision):
    target_status: HypothesisStatus = HypothesisStatus.OPEN


@dataclass(frozen=True)
class AnnotateDecision(Decision):
    key: str = ""
    value: Any = None


@dataclass(frozen=True)
class NoOpDecision(Decision):
    pass


def set_status(target: HypothesisStatus | str, reason: str = "") -> SetStatusDecision:
    if isinstance(target, str):
        target = HypothesisStatus(target)
    return SetStatusDecision(
        action=DecisionAction.SET_STATUS,
        reason=reason,
        timestamp=datetime.now(timezone.utc),
        target_status=target,
    )


def annotate(key: str, value: Any, reason: str = "") -> AnnotateDecision:
    return AnnotateDecision(
        action=DecisionAction.ANNOTATE,
        reason=reason,
        timestamp=datetime.now(timezone.utc),
        key=key,
        value=value,
    )


def no_op(reason: str = "") -> NoOpDecision:
    return NoOpDecision(
        action=DecisionAction.NO_OP,
        reason=reason,
        timestamp=datetime.now(timezone.utc),
    )
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""Pure Python helper functions for rule predicates."""
Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""Confidence, reliability, and credibility helpers."""

from __future__ import annotations

from typing import Any

_RELIABILITY_ORDER = ["F", "E", "D", "C", "B", "A"]


def has_confidence(h: Any) -> bool:
    """True if the hypothesis has a ConfidenceScore assigned."""
    return getattr(h, "confidence", None) is not None


def stix_confidence(h: Any) -> int:
    """STIX confidence (0-100), or 0 if no confidence set."""
    conf = getattr(h, "confidence", None)
    if conf is None:
        return 0
    return getattr(conf, "stix_confidence", 0)


def confidence_band(h: Any) -> str | None:
    """Return the confidence level band (HIGH/MEDIUM/LOW) or None."""
    conf = getattr(h, "confidence", None)
    if conf is None:
        return None
    band = getattr(conf, "band", None)
    if band is None:
        return None
    return band.value if hasattr(band, "value") else str(band)


def reliability_of(h: Any) -> str | None:
    """Source reliability letter (A-F) or None."""
    conf = getattr(h, "confidence", None)
    if conf is None:
        return None
    sr = getattr(conf, "source_reliability", None)
    if sr is None:
        return None
    return sr.value if hasattr(sr, "value") else str(sr)


def credibility_of(h: Any) -> int | None:
    """Information credibility (1-6) or None."""
    conf = getattr(h, "confidence", None)
    if conf is None:
        return None
    ic = getattr(conf, "information_credibility", None)
    if ic is None:
        return None
    return ic.value if hasattr(ic, "value") else int(ic)


def reliability_at_least(h: Any, level: str) -> bool:
    """True if reliability meets or exceeds the given level."""
    actual = reliability_of(h)
    if actual is None:
        return False
    try:
        return _RELIABILITY_ORDER.index(actual) >= _RELIABILITY_ORDER.index(level)
    except ValueError:
        return False


def credibility_at_least(h: Any, level: int) -> bool:
    """True if credibility meets or exceeds the given level (lower is better)."""
    actual = credibility_of(h)
    if actual is None:
        return False
    return actual <= level
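
The two comparisons run in opposite directions, which is easy to get wrong when writing rules. The sketch below restates them on plain values rather than hypothesis objects; `reliability_meets` and `credibility_meets` are illustrative names, not part of the module.

```python
_RELIABILITY_ORDER = ["F", "E", "D", "C", "B", "A"]  # worst to best


def reliability_meets(actual: str, level: str) -> bool:
    """Letters: A is best, so a later index in the list is better."""
    try:
        return _RELIABILITY_ORDER.index(actual) >= _RELIABILITY_ORDER.index(level)
    except ValueError:
        return False  # unknown letter never satisfies a threshold


def credibility_meets(actual: int, level: int) -> bool:
    """Numbers: 1 is best, so a smaller value is better."""
    return actual <= level


print(reliability_meets("A", "B"))  # True
print(reliability_meets("C", "B"))  # False
print(credibility_meets(1, 2))      # True
print(credibility_meets(3, 2))      # False
```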
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""Evidence count and ratio helpers."""

from __future__ import annotations

from typing import Any


def supporting_count(h: Any) -> int:
    """Number of supporting evidence items."""
    return len(getattr(h, "supporting_evidence", []) or [])


def refuting_count(h: Any) -> int:
    """Number of refuting evidence items."""
    return len(getattr(h, "refuting_evidence", []) or [])


def evidence_count(h: Any) -> int:
    """Total evidence items (supporting + refuting)."""
    return supporting_count(h) + refuting_count(h)


def has_refutation(h: Any) -> bool:
    """True if any refuting evidence exists."""
    return refuting_count(h) > 0


def support_ratio(h: Any) -> float:
    """Supporting / (total + 1). Smoothed to avoid division by zero."""
    total = evidence_count(h)
    return supporting_count(h) / (total + 1)
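
A consequence of the `+ 1` smoothing is that the ratio never reaches 1.0, even with no refuting evidence, which matters when picking thresholds in rules. The worked example below restates the formula on plain counts; `smoothed_support_ratio` is an illustrative name.

```python
def smoothed_support_ratio(supporting: int, refuting: int) -> float:
    """Same formula as support_ratio, taking counts instead of a hypothesis."""
    return supporting / (supporting + refuting + 1)


print(smoothed_support_ratio(0, 0))  # 0.0 (no evidence, no division by zero)
print(smoothed_support_ratio(3, 1))  # 0.6 (3 / 5)
print(smoothed_support_ratio(9, 0))  # 0.9 (9 / 10, still below 1.0)
```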
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
# SPDX-License-Identifier: Apache-2.0
# Copyright 2026 Bill Halpin
"""AI confidence ceiling policy helper."""

from __future__ import annotations

from typing import Any

from gnat.analysis.rules.helpers.confidence import stix_confidence
from gnat.analysis.rules.helpers.source import ai_only


def within_ai_ceiling(h: Any, ctx: Any) -> bool:
    """True if NOT ai-only, OR ai-only AND confidence <= ceiling."""
    if not ai_only(h, ctx):
        return True
    ceiling = getattr(ctx.policy, "ai_confidence_ceiling", 60)
    return stix_confidence(h) <= ceiling
