34 changes: 34 additions & 0 deletions CHANGELOG.md
@@ -19,6 +19,40 @@ all v1.4+ modules.
→ Full feature breakdown is in `## [1.4.0]` below; this entry marks the version cut.
## [Unreleased]

### Added — Analysis rule engine for hypothesis evaluation

New `gnat/analysis/rules/` package implementing a declarative Hy-based
rule engine that evaluates `analysis.Hypothesis` objects and returns
status transition decisions (OPEN → SUPPORTED, REFUTED, INCONCLUSIVE).

- `AnalysisRuleEngine.evaluate()` — priority-sorted rule evaluation
with phase gates, transition-slot semantics, dirty-tree refusal in
production, exception logging (never halts). Returns decisions
without mutating state.
- `RuleOrchestrator` — bridges engine to InvestigationService with
audit-first pattern. Writes all firing records before applying the
primary decision.
- `AuditWriter` — rule_firing_audit table (alembic 0010) with git SHA
capture, in-memory fallback when SQLAlchemy unavailable.
- `EvidenceResolver` — batch-resolves STIX IDs to source_platform and
TRUST_LEVEL via WorkspaceStore + CLIENT_REGISTRY. Per-eval cache.
- `RuleEnginePolicy` — 7-field config from INI `[rules]` section with
`GNAT_ALLOW_DIRTY_RULES=1` env override. Feature flag default OFF.
- 26 helper functions in 6 modules: evidence (counts, ratio),
confidence (Admiralty Scale, band), temporal (staleness, freshness),
status, policy (AI-60 ceiling), source (trust levels, AI-only).
- `defrule` Hy macro + `set-status`/`annotate`/`no-op` constructors.
- Hy helper surface re-exporting all Python helpers with Lisp naming.
- `RuleLoader` with directory walking, priority sort, stat-on-call
hot reload, graceful fallback when Hy not installed.
- `RuleEngineProtocol` (typing.Protocol) + `create_engine()` factory.
- `IS_AI_CONNECTOR = True` added to ChatGPT and Copilot connectors.
- 3 production reference rules + 4 examples in `rules/` directory.
- ADR-0054 documenting all architectural decisions.
- 4 Diataxis docs: tutorial, how-to, spec, explanation.
- pyproject.toml `[rules]` extras group (`hy>=1.0`).
- 123 tests, all passing.

### Added — Cuckoo Sandbox / CAPEv2 connector

New `gnat/connectors/cuckoo/` connector for dynamic malware analysis.
2 changes: 2 additions & 0 deletions CLAUDE.md
@@ -47,6 +47,7 @@ gnat/ # Main Python package
├── codegen/ # OpenAPI → connector scaffolding
├── async_client/ # Async variant (httpx)
├── analysis/attribution/ # Campaign tracking, Diamond Model, kill-chain, infrastructure classification, attribution hypotheses
├── analysis/rules/ # Hy-based rule engine for hypothesis evaluation (26 helpers, audit trail, feature flag OFF)
├── plugins/huntgnat/ # STIX → detection rule translation (Sigma, YARA, Suricata, Snort), hunt packages, ATT&CK coverage, validation
├── serve/ # FastAPI HTTP server + TAXII endpoint
├── search/ # Solr full-text search sidecar (SearchMixin, indexer, ORM integration)
@@ -445,6 +446,7 @@ pip install "gnat[tui]" # Terminal UI (textual)
pip install "gnat[nlp]" # NLP/IOC extraction (builtin; claude backend reuses agents)
pip install "gnat[stix-validate]" # Full STIX pattern validation (stix2-patterns ANTLR grammar)
pip install "gnat[telemetry]" # High-volume sensor ingestion (kafka-python-ng + redis)
pip install "gnat[rules]" # Hy-based analysis rule engine for hypothesis evaluation
pip install "gnat[analysis]" # Attribution & campaign tracking persistence (sqlalchemy)
pip install "gnat[fast]" # Rust native extension (install gnat-core wheel separately)
pip install "gnat[all]" # Everything
2 changes: 2 additions & 0 deletions README.md
@@ -67,6 +67,7 @@ GNAT provides a single, consistent abstraction layer over 158 platforms — thre
| **Investigation Builder** | Five-step cross-platform evidence graph pipeline (seed → incident expansion → normalise → correlate → materialise) from any combination of connected platforms |
| **HuntGNAT** | STIX pattern → detection rule translation (Sigma, YARA, Suricata, Snort), hunt packages with lifecycle management, ATT&CK coverage matrix, deployment tracking with drift detection, validation scoring |
| **Telemetry Ingestion** | High-volume sensor ingestion from Kafka topics (honeypot, netflow, IDS alert, DNS log); Redis-backed deduplication; automatic campaign linking of ingested indicators |
| **Rule Engine** | Hy/Lisp declarative rule engine for automated hypothesis evaluation; 26 analyst-authorable helper predicates; priority-based first-match with audit trail; AI-60 confidence ceiling; feature flag OFF by default |
| **Intelligence Reports** | Structured finished-intelligence products with five-state lifecycle (DRAFT → PUBLISHED), STIX 2.1 SDO export on publish, versioning, and attribution |
| **Dissemination** | ExportService (STIX/JSON/PDF), webhook fan-out with HMAC signing, TAXII 2.1 server, bearer-token REST gateway |
| **Solr Search** | Full-text search sidecar; auto-indexes on write; Grafana SimpleJSON dashboards |
@@ -365,6 +366,7 @@ pip install "gnat[tui]" # Interactive terminal UI (textual)
pip install "gnat[nlp]" # NLP query engine (zero deps for builtin; Claude backend requires [agents])
pip install "gnat[stix-validate]" # Tier-2 STIX pattern validation (stix2-patterns / ANTLR)
pip install "gnat[telemetry]" # High-volume sensor ingestion (kafka-python-ng + redis)
pip install "gnat[rules]" # Analysis rule engine for hypothesis evaluation (hy)
pip install "gnat[analysis]" # Attribution & campaign tracking (sqlalchemy)
pip install "gnat[fast]" # Rust IOC hot-path extension (maturin wheel)
pip install "gnat[all]" # Core extras (yaml, taxii, ingest, async, persist, schedule, reports, viz, serve)
94 changes: 94 additions & 0 deletions docs/explanation/rule-engine.md
@@ -0,0 +1,94 @@
# Rule Engine — Design Explanation

## Why a rule engine

`InvestigationService.update_hypothesis_status` is a pure setter — it
changes state but doesn't evaluate whether the change is warranted.
Today, hypothesis transitions happen manually. The `reasoning.HypothesisEngine`
has hardcoded thresholds, but it operates on `STIXHypothesis` (STIX-level
objects), not on `analysis.Hypothesis` (workspace-level objects).

The rule engine fills the gap: automated, auditable, analyst-authorable
hypothesis evaluation at the analysis layer.

## Why Hy (Lisp-on-Python)

Rules need to be:
- **Declarative** — analysts read the rule and understand what it does
- **In-process** — no separate runtime, no marshaling across boundaries
- **Composable** — helper predicates combine naturally

Hy compiles to Python AST and runs in the same interpreter. It's more
declarative than Python (S-expressions force a functional style) but
less foreign than Prolog (no separate process, no unification). YAML
was considered but rejected because expressions-in-YAML becomes its
own interpreter; a YAML engine may be added later via the Protocol.

## Two engines coexist

The `reasoning.HypothesisEngine` (ADR-0042) and the analysis
`AnalysisRuleEngine` (ADR-0054) are not duplicates. They operate
on different objects at different layers:

| | HypothesisEngine | AnalysisRuleEngine |
|---|---|---|
| Object type | `STIXHypothesis` | `analysis.Hypothesis` |
| Layer | STIX-level reasoning | Analyst workspace |
| Thresholds | Hardcoded in Python | Declared in `.hy` rule files |
| Authorship | GNAT maintainer | CTI analysts |

Neither modifies the other. They will eventually feed into each other
(STIX-level engine proposes, analysis-level engine evaluates), but
that integration is post-v1.

## Advisor pattern

Rules return decisions — they never mutate state directly. The
`RuleOrchestrator` reads the engine's `RuleEvaluationResult` and
applies the primary decision via `InvestigationService`. This keeps
state machine authority in one place and makes the engine testable
with no service dependency.
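
The split between deciding and mutating can be sketched in plain Python. This is a minimal illustration, not the package's implementation: `Decision`, `FakeService`, `evaluate`, and `apply_decision` are hypothetical stand-ins for the real result type, `InvestigationService`, the engine, and the `RuleOrchestrator`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Decision:
    """What a rule recommends; carries no side effects (hypothetical type)."""
    kind: str                 # "set-status", "annotate", or "no-op"
    status: Optional[str]     # target status for "set-status", else None
    reason: str


@dataclass
class Hypothesis:
    id: str
    status: str = "open"


class FakeService:
    """Stand-in for InvestigationService: the only place state changes."""

    def update_hypothesis_status(self, h: Hypothesis, status: str) -> None:
        h.status = status


def evaluate(h: Hypothesis,
             rule: Callable[[Hypothesis], Optional[Decision]]) -> Optional[Decision]:
    """Engine side: ask the rule for a decision and return it untouched."""
    return rule(h)


def apply_decision(service: FakeService, h: Hypothesis,
                   decision: Optional[Decision]) -> None:
    """Orchestrator side: translate a decision into a service call."""
    if decision is not None and decision.kind == "set-status":
        service.update_hypothesis_status(h, decision.status)


h = Hypothesis(id="hyp-1")
decision = evaluate(h, lambda _h: Decision("set-status", "supported", "3+ supporting"))
status_after_evaluate = h.status      # evaluation alone did not mutate anything
apply_decision(FakeService(), h, decision)
status_after_apply = h.status         # the orchestrator applied the decision
```

Because `evaluate` only returns a value, it can be exercised in a unit test without constructing any service object.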

## Evidence resolution

`Hypothesis.supporting_evidence` and `refuting_evidence` are lists of
STIX IDs. To answer "is this evidence from a trusted source?", the
engine uses `EvidenceResolver`, which:

1. Batch-queries `WorkspaceStore.get_source_platforms_bulk()`
2. Looks up the connector class from `CLIENT_REGISTRY`
3. Reads `TRUST_LEVEL` from the class
4. Caches results for the evaluation's lifetime

STIX objects are never modified with connector metadata. The resolver
is a lookup layer, not a mutation layer.
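
The four steps above can be sketched as follows. Everything here is an assumption made for illustration: `FakeStore` stands in for `WorkspaceStore`, `REGISTRY` for `CLIENT_REGISTRY`, and the platform names and the `"community"` and `"unknown"` trust values are invented; only `get_source_platforms_bulk`, `TRUST_LEVEL`, and `trusted_internal` come from the text.

```python
from typing import Dict, Iterable, List


class FakeStore:
    """Stand-in for WorkspaceStore; counts how often the bulk query runs."""

    def __init__(self, platforms: Dict[str, str]) -> None:
        self._platforms = platforms
        self.calls = 0

    def get_source_platforms_bulk(self, stix_ids: Iterable[str]) -> Dict[str, str]:
        self.calls += 1
        return {i: self._platforms[i] for i in stix_ids if i in self._platforms}


class InternalConnector:
    TRUST_LEVEL = "trusted_internal"


class FeedConnector:
    TRUST_LEVEL = "community"   # invented value, for illustration only


# Stand-in for CLIENT_REGISTRY: platform name -> connector class.
REGISTRY = {"internal-siem": InternalConnector, "osint-feed": FeedConnector}


class Resolver:
    """Resolves STIX IDs to trust levels without touching the STIX objects."""

    def __init__(self, store: FakeStore, registry: Dict[str, type]) -> None:
        self._store = store
        self._registry = registry
        self._cache: Dict[str, str] = {}   # lives as long as one evaluation

    def trust_levels(self, stix_ids: List[str]) -> Dict[str, str]:
        missing = [i for i in stix_ids if i not in self._cache]
        if missing:
            # Step 1: one batch query for everything not yet cached.
            platforms = self._store.get_source_platforms_bulk(missing)
            for stix_id in missing:
                # Steps 2 and 3: connector class, then its TRUST_LEVEL.
                cls = self._registry.get(platforms.get(stix_id, ""))
                self._cache[stix_id] = getattr(cls, "TRUST_LEVEL", "unknown")
        # Step 4: answer from the cache.
        return {i: self._cache[i] for i in stix_ids}


store = FakeStore({"indicator--a": "internal-siem", "indicator--b": "osint-feed"})
resolver = Resolver(store, REGISTRY)
first = resolver.trust_levels(["indicator--a", "indicator--b", "indicator--c"])
second = resolver.trust_levels(["indicator--a"])   # served from the cache
```

Creating one resolver per evaluation keeps the cache from outliving the data it was read from, which is one plausible reading of "caches results for the evaluation's lifetime".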

## AI confidence ceiling

The policy that AI-generated confidence cannot exceed 60 is enforced
as a predicate (`within-ai-ceiling?`), not a clamp. Rules call it in
their `:when` clause — if the ceiling is violated, the rule refuses
to promote. The invariant is visible in rule source code, not hidden
in a mutation pipeline.
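
The difference between a predicate and a clamp can be shown in a few lines of Python. This is only a sketch: the function names and the flat `(confidence, ai_only)` signature are assumptions, whereas the real `within-ai-ceiling?` helper takes `h` and `ctx`.

```python
AI_CONFIDENCE_CEILING = 60  # the ceiling stated in the policy above


def within_ai_ceiling(confidence: int, ai_only: bool) -> bool:
    """Predicate form: report whether the invariant holds; change nothing."""
    return (not ai_only) or confidence <= AI_CONFIDENCE_CEILING


def clamp_confidence(confidence: int, ai_only: bool) -> int:
    """Clamp form (the alternative not taken): silently rewrite the value."""
    return min(confidence, AI_CONFIDENCE_CEILING) if ai_only else confidence


def may_promote(confidence: int, ai_only: bool) -> bool:
    """A rule's guard: refuse promotion when the ceiling is violated."""
    return within_ai_ceiling(confidence, ai_only)


at_ceiling = may_promote(60, ai_only=True)        # allowed: exactly at the ceiling
over_ceiling = may_promote(61, ai_only=True)      # refused: above the ceiling
human_backed = may_promote(95, ai_only=False)     # ceiling does not apply
clamped = clamp_confidence(95, ai_only=True)      # the original 95 is lost
```

With the clamp, the 95 disappears and nothing downstream can tell it was ever reported; with the predicate, the value stays intact and the refusal shows up as an explicit rule outcome.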

## Audit trail

Every rule evaluation writes an audit record **before** applying the
decision. The record captures: rule name, source file path, git SHA,
decision JSON, and a boolean `applied` flag. If mutation fails, the
error is recorded but the audit row already exists. This ensures
complete traceability even for failed transitions.
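
A minimal sketch of the audit-first ordering, using an in-memory list in place of the `rule_firing_audit` table. The `AuditRow` fields mirror the list above, but the class names, the `error` field, the `fire` helper, the file path, and the SHA are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AuditRow:
    rule_name: str
    source_path: str
    git_sha: str
    decision_json: str
    applied: bool = False
    error: Optional[str] = None


@dataclass
class InMemoryAudit:
    rows: List[AuditRow] = field(default_factory=list)

    def write(self, row: AuditRow) -> AuditRow:
        self.rows.append(row)
        return row


def fire(audit: InMemoryAudit, row: AuditRow,
         mutate: Callable[[], None]) -> AuditRow:
    """Write the audit row first, then attempt the mutation."""
    written = audit.write(row)      # the row exists before any state change
    try:
        mutate()
        written.applied = True
    except Exception as exc:        # the failure is recorded, not re-raised
        written.error = str(exc)
    return written


def failing_mutation() -> None:
    raise RuntimeError("illegal transition")


audit = InMemoryAudit()
ok = fire(audit,
          AuditRow("promote-on-evidence", "rules/promote.hy", "abc1234",
                   '{"status": "supported"}'),
          lambda: None)
bad = fire(audit,
           AuditRow("refute-dominant", "rules/refute.hy", "abc1234",
                    '{"status": "refuted"}'),
           failing_mutation)
```

Both firings leave a row behind; only the `applied` flag and the recorded error distinguish the successful transition from the failed one.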

## Dirty-tree policy

In production, rule files with uncommitted changes will not fire. The
engine runs `git status --porcelain` against each rule's source file and
records the file's latest commit SHA (from `git log`) at firing time. This
protects the audit trail: every fired rule can be traced to an exact
committed version.

`GNAT_ALLOW_DIRTY_RULES=1` overrides this for development.
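
The check can be approximated with two git commands. The sketch below is an assumption about how such a gate might look, not the engine's code: the function names, the injectable `run` parameter, and the choice to let dirty files fire outside production are all hypothetical. The git invocations themselves (`git status --porcelain -- <path>` and `git log -1 --format=%H -- <path>`) are standard.

```python
import os
import subprocess
from typing import Callable, List, Mapping

Runner = Callable[[List[str]], str]


def run_git(args: List[str]) -> str:
    """Run a git command and return its stdout (raises on non-zero exit)."""
    return subprocess.run(["git", *args], capture_output=True,
                          text=True, check=True).stdout


def is_dirty(path: str, run: Runner = run_git) -> bool:
    """`git status --porcelain -- <path>` prints nothing for a clean file."""
    return run(["status", "--porcelain", "--", path]).strip() != ""


def last_commit_sha(path: str, run: Runner = run_git) -> str:
    """SHA of the most recent commit that touched the file."""
    return run(["log", "-1", "--format=%H", "--", path]).strip()


def may_fire(path: str, production: bool,
             env: Mapping[str, str] = os.environ,
             run: Runner = run_git) -> bool:
    """Refuse dirty rule files in production unless the override is set."""
    if env.get("GNAT_ALLOW_DIRTY_RULES") == "1":
        return True
    if not production:
        return True                  # assumption: no gate outside production
    return not is_dirty(path, run)


def fake_run(args: List[str]) -> str:
    """Fake runner so the logic can be tried without a repository."""
    if args[0] == "status":
        return " M rules/promote.hy\n" if args[-1] == "rules/promote.hy" else ""
    return "0123abc\n"


blocked = may_fire("rules/promote.hy", True, {}, fake_run)
allowed = may_fire("rules/other.hy", True, {}, fake_run)
overridden = may_fire("rules/promote.hy", True,
                      {"GNAT_ALLOW_DIRTY_RULES": "1"}, fake_run)
sha = last_commit_sha("rules/other.hy", fake_run)
```

Note that `git status --porcelain` also reports untracked files (as `??`), so under this sketch a brand-new, never-committed rule file counts as dirty.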

→ [ADR-0054: Analysis Rule Engine](architecture/adrs/0054-ADR-analysis-rule-engine.md)
→ [Rule Engine Spec](../reference/rule-engine-spec.md)
→ [Authoring Rules](../how-to/authoring-rules.md)
→ [Your First Rule](../tutorials/your-first-rule.md)
122 changes: 122 additions & 0 deletions docs/how-to/authoring-rules.md
@@ -0,0 +1,122 @@
# Authoring Rules

Recipes for common hypothesis evaluation rule patterns.

## Pattern 1: Evidence-threshold promotion

Promote OPEN → SUPPORTED when evidence count meets a threshold:

```hy
(defrule promote-on-evidence
:phase "open"
:priority 100
:when (fn [h ctx]
(and (>= (supporting-count h) 3)
(not (has-refutation? h))))
:then (fn [h ctx]
(set-status "supported" :reason "3+ supporting, no refutation")))
```

## Pattern 2: Refutation

Mark REFUTED when refuting evidence dominates:

```hy
(defrule refute-dominant
:phase "open"
:priority 90
:when (fn [h ctx]
(and (>= (refuting-count h) 2)
(> (refuting-count h) (supporting-count h))))
:then (fn [h ctx]
(set-status "refuted" :reason "Refuting evidence exceeds supporting")))
```

## Pattern 3: Blocking with no-op

Prevent lower-priority rules from firing by consuming the transition slot:

```hy
(defrule block-low-confidence
:phase "open"
:priority 110
:when (fn [h ctx]
(and (has-confidence? h)
(not (reliability-at-least? h "C"))))
:then (fn [h ctx]
(no-op :reason "Reliability below C — blocking promotion")))
```
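
One way to picture the transition slot is the Python sketch below. It is a simplified model under stated assumptions, not the engine's algorithm: rules run from highest to lowest priority, the first `set-status` or `no-op` decision claims the slot, later slot-claiming decisions are discarded, and `annotate` decisions never claim it. All names, the dict-based hypothesis, and the `flag-unreviewed` rule are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Decision:
    kind: str      # "set-status", "no-op", or "annotate"
    reason: str


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    when: Callable[[dict], bool]
    then: Callable[[dict], Decision]


def evaluate(rules: List[Rule], h: dict) -> Tuple[Optional[Decision], List[Decision]]:
    """Return (primary decision, annotations) for one hypothesis."""
    primary: Optional[Decision] = None
    annotations: List[Decision] = []
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if not rule.when(h):
            continue
        decision = rule.then(h)
        if decision.kind == "annotate":
            annotations.append(decision)   # never takes the slot
        elif primary is None:
            primary = decision             # first set-status or no-op wins
    return primary, annotations


rules = [
    Rule("promote-on-evidence", 100,
         lambda h: h["supporting"] >= 3,
         lambda h: Decision("set-status", "3+ supporting")),
    Rule("block-low-confidence", 110,
         lambda h: h["reliability"] > "C",   # "D" sorts after "C": below C
         lambda h: Decision("no-op", "reliability below C")),
    Rule("flag-unreviewed", 10,
         lambda h: True,
         lambda h: Decision("annotate", "not yet reviewed")),
]

blocked_primary, blocked_notes = evaluate(rules, {"supporting": 4, "reliability": "D"})
promoted_primary, _ = evaluate(rules, {"supporting": 4, "reliability": "B"})
```

In the first call the priority-110 guard claims the slot with a `no-op`, so the priority-100 promotion is ignored even though its condition holds; the annotation is collected either way.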

## Pattern 4: AI confidence ceiling

Block promotion when all evidence is AI-sourced and confidence exceeds 60:

```hy
(defrule ai-ceiling-guard
:phase "open"
:priority 150
:when (fn [h ctx]
(and (ai-only? h ctx)
(not (within-ai-ceiling? h ctx))))
:then (fn [h ctx]
(no-op :reason "AI-only evidence exceeds confidence ceiling")))
```

## Pattern 5: Staleness timeout

Mark hypotheses INCONCLUSIVE after extended inactivity:

```hy
(defrule stale-timeout
:phase "open"
:priority 20
:when (fn [h ctx]
(stale? h 90))
:then (fn [h ctx]
(set-status "inconclusive"
:reason "No updates in 90+ days")))
```
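
For readers who want to see what a staleness predicate might compute, here is a hedged Python sketch. The name `is_stale`, the explicit `now` parameter, and the inclusive boundary (exactly 90 days counts as stale, matching the "90+" wording) are assumptions; the real `stale?` helper takes the hypothesis itself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_updated: datetime, max_age_days: int,
             now: Optional[datetime] = None) -> bool:
    """True when at least `max_age_days` have passed since `last_updated`."""
    now = now or datetime.now(timezone.utc)
    return now - last_updated >= timedelta(days=max_age_days)


# A fixed "now" keeps the example deterministic.
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
recent = is_stale(now - timedelta(days=30), 90, now)
old = is_stale(now - timedelta(days=120), 90, now)
```

Passing `now` explicitly, rather than reading the clock inside the function, makes time-based rules straightforward to test.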

## Pattern 6: Source trust gate

Only promote when evidence includes at least one trusted_internal source:

```hy
(defrule require-trusted-source
:phase "open"
:priority 105
:when (fn [h ctx]
(and (>= (supporting-count h) 3)
(not (has-trusted-evidence? h ctx))))
:then (fn [h ctx]
(no-op :reason "No trusted_internal evidence — cannot promote")))
```

## Pattern 7: Annotation (non-blocking)

Add metadata without affecting the transition slot:

```hy
(defrule flag-weak-evidence
:phase "open"
:priority 10
:when (fn [h ctx]
(< (supporting-count h) 2))
:then (fn [h ctx]
(annotate "needs-evidence" True
:reason "Fewer than 2 supporting items")))
```

## Priority guidelines

| Range | Use |
|-------|-----|
| 200+ | Analyst overrides |
| 100–199 | Production promotion/refutation rules |
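| 50–99 | Secondary rules (guards, gates); a guard that must block a promotion rule needs a higher priority than that rule, as in Patterns 3, 4 and 6 |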
| 1–49 | Annotations, informational |

## Helper reference

Run `gnat rules list-helpers` for the complete helper catalog, or see
[Rule Engine Spec](../reference/rule-engine-spec.md).