Implement analysis layer: Phase 0-2 (foundation, investigations, repo… by wrhalpin · Pull Request #73 · wrhalpin/GNAT

wrhalpin · 2026-04-07T13:27:20Z

…rting)

Phase 0 — Foundation

gnat/analysis/tlp.py: TLPLevel enum (TLP 2.0 WHITE/CLEAR/GREEN/AMBER/AMBER+STRICT/RED) with STIX marking IDs, hex colours, rank ordering
gnat/analysis/confidence.py: ConfidenceScore combining NATO Admiralty Scale (source reliability A–F, information credibility 1–6) with STIX numeric confidence 0–100; ConfidenceLevel bands (HIGH/MEDIUM/LOW); convenience factories high/medium/low()
ADR-0031: Analysis layer architecture — layered consumer model, no new storage backend, WorkspaceStore persistence pattern
ADR-0032: STIX custom objects — x-gnat-investigation SDO, investigates relationship verb, standard report SDO for finished intelligence
ADR-0033: Confidence scoring — rationale for Admiralty Scale + STIX numeric confidence; HIGH/MEDIUM/LOW bands aligned with ATT&CK
ADR-0034: Report lifecycle — five-state machine, REVIEW→DRAFT reject path, immutability on PUBLISHED, STIX bundle triggered on publish
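
As a rough illustration of the composite confidence model described above, the following is a minimal sketch rather than the code in `gnat/analysis/confidence.py`. The field names, the `admiralty` property, and the band thresholds (70 and above is HIGH, 30 to 69 is MEDIUM, below 30 is LOW, borrowed from the Low/Med/High scale in the STIX 2.1 specification's confidence appendix) are assumptions; the PR description only states that the bands are aligned with ATT&CK.

```python
from dataclasses import dataclass
from enum import Enum


class ConfidenceLevel(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


@dataclass(frozen=True)
class ConfidenceScore:
    """Admiralty rating paired with a STIX-style numeric confidence."""

    reliability: str  # source reliability: "A" (completely reliable) to "F" (cannot be judged)
    credibility: int  # information credibility: 1 (confirmed) to 6 (cannot be judged)
    numeric: int      # STIX confidence, 0-100

    def __post_init__(self) -> None:
        # Bounds validation: reject anything outside the three scales.
        if self.reliability not in ("A", "B", "C", "D", "E", "F"):
            raise ValueError(f"reliability must be A-F, got {self.reliability!r}")
        if not 1 <= self.credibility <= 6:
            raise ValueError(f"credibility must be 1-6, got {self.credibility!r}")
        if not 0 <= self.numeric <= 100:
            raise ValueError(f"numeric must be 0-100, got {self.numeric!r}")

    @property
    def admiralty(self) -> str:
        """Conventional two-character rating such as 'B2'."""
        return f"{self.reliability}{self.credibility}"

    @property
    def level(self) -> ConfidenceLevel:
        # Assumed thresholds; the real bands live in ADR-0033.
        if self.numeric >= 70:
            return ConfidenceLevel.HIGH
        if self.numeric >= 30:
            return ConfidenceLevel.MEDIUM
        return ConfidenceLevel.LOW
```

Keeping the Admiralty pair and the numeric value side by side lets analysts record how they judged the source while still exporting a single STIX `confidence` integer.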

Phase 1 — gnat.analysis.investigations

Investigation dataclass: state machine OPEN→IN_PROGRESS→REVIEW→CLOSED, TLP classification, scope, hypotheses, analyst notes, tasks, artifact refs
Hypothesis, AnalystNote, InvestigationTask, InvestigationScope dataclasses
InvestigationStore: SQLAlchemy-backed (sqlite:///:memory: for tests), zero-migration create_all(), JSON-serialization + indexed metadata columns
InvestigationService: enforces transitions, note/task/hypothesis/artifact mutation, deduplicating tag/indicator linking, summary
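
The transition enforcement described for `InvestigationService` can be sketched as a lookup table plus a guard. This is an illustration only: the enum name `InvestigationStatus`, its string values, and the assumption that only the forward chain OPEN→IN_PROGRESS→REVIEW→CLOSED is permitted are not confirmed by the PR description.

```python
from enum import Enum


class InvestigationStatus(Enum):  # hypothetical name
    OPEN = "open"
    IN_PROGRESS = "in_progress"
    REVIEW = "review"
    CLOSED = "closed"


class InvestigationError(Exception):
    """Raised when a requested state change is not allowed."""


# Assumption: only the forward chain named in the description is allowed.
INVESTIGATION_TRANSITIONS = {
    InvestigationStatus.OPEN: frozenset({InvestigationStatus.IN_PROGRESS}),
    InvestigationStatus.IN_PROGRESS: frozenset({InvestigationStatus.REVIEW}),
    InvestigationStatus.REVIEW: frozenset({InvestigationStatus.CLOSED}),
    InvestigationStatus.CLOSED: frozenset(),
}


def check_transition(
    current: InvestigationStatus, target: InvestigationStatus
) -> InvestigationStatus:
    """Return the target state if the move is legal, otherwise raise."""
    if target not in INVESTIGATION_TRANSITIONS[current]:
        raise InvestigationError(
            f"cannot move from {current.value} to {target.value}"
        )
    return target
```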

Phase 2 — gnat.reporting

Report dataclass: DRAFT→REVIEW→APPROVED→PUBLISHED→ARCHIVED lifecycle, versioning with parent_report_id, TLP, findings, evidence binding, attribution, STIX export
Finding, EvidenceLink, Attribution, ReportSection, ChangelogEntry
ReportStore: same SQLAlchemy pattern as InvestigationStore
ReportService: lifecycle transitions, immutability enforcement on PUBLISHED, create_revision() for updates to published reports
report_to_stix_bundle(): STIX 2.1 bundle (report SDO + identity + threat-actor + attributed-to rel if attribution set); TLP marking refs; x_gnat_* extension fields; stix_report_ref set on publish
Three YAML report templates: incident_report, threat_actor_profile, campaign_analysis (with section structure and analyst guidance)
[analysis] and [reporting] optional dependency extras
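
To make the shape of the exported bundle concrete, here is a minimal sketch built from plain dictionaries. It is not `report_to_stix_bundle()` itself: the function name `report_bundle_sketch` and its parameters are hypothetical, the `x_gnat_*` fields are omitted because their names are not listed here, and the attributed-to relationship is left out because its source and target objects are not stated in the description.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def _stix_id(object_type: str) -> str:
    # STIX identifiers have the form "<type>--<uuid>".
    return f"{object_type}--{uuid.uuid4()}"


def _stix_ts(dt: datetime) -> str:
    # STIX timestamps are UTC, here with millisecond precision and a "Z" suffix.
    utc = dt.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def report_bundle_sketch(
    name: str,
    published: datetime,
    author_name: str,
    actor_name: Optional[str] = None,
    tlp_marking_ref: Optional[str] = None,
) -> dict:
    now = _stix_ts(datetime.now(tz=timezone.utc))
    common = {"spec_version": "2.1", "created": now, "modified": now}

    identity = {"type": "identity", "id": _stix_id("identity"),
                "name": author_name, "identity_class": "organization", **common}
    objects = [identity]

    if actor_name is not None:
        # Only emitted when the report carries an attribution.
        objects.append({"type": "threat-actor", "id": _stix_id("threat-actor"),
                        "name": actor_name, **common})

    report = {"type": "report", "id": _stix_id("report"), "name": name,
              "published": _stix_ts(published),
              "created_by_ref": identity["id"],
              # object_refs is required and lists what the report is about.
              "object_refs": [obj["id"] for obj in objects],
              **common}
    if tlp_marking_ref is not None:
        report["object_marking_refs"] = [tlp_marking_ref]
    objects.append(report)

    return {"type": "bundle", "id": _stix_id("bundle"), "objects": objects}
```

A caller would pass the report's publication time and, if attribution is set, the actor name; the resulting dictionary can be serialized with `json.dumps`.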

Tests (81 tests, all passing)

tests/unit/analysis/test_confidence.py: 19 tests
tests/unit/analysis/test_investigations.py: 24 tests
tests/unit/reporting/test_reports.py: 38 tests

https://claude.ai/code/session_01BDoue9HxB83ijLzFARAugq


Copilot

Pull request overview

Introduces an initial “analysis layer” and “reporting layer” to GNAT, adding analyst-facing Investigation/Report domain models, lifecycle services, SQLAlchemy-backed persistence, and STIX 2.1 export, along with ADRs and unit tests.

Changes:

Add gnat.analysis foundation types (TLPLevel, ConfidenceScore) plus gnat.analysis.investigations (models/service/store).
Add gnat.reporting (models/service/store), YAML report templates, and STIX bundle export.
Add ADRs (0031–0034), update packaging (pyproject.toml), and add unit tests + changelog entry.

Reviewed changes

Copilot reviewed 25 out of 27 changed files in this pull request and generated 12 comments.

Show a summary per file

| File | Description |
| --- | --- |
| `pyproject.toml` | Adds `[analysis]` / `[reporting]` extras and includes reporting templates as package data. |
| `gnat/analysis/__init__.py` | Exposes analysis-layer public API (confidence + TLP). |
| `gnat/analysis/tlp.py` | Implements TLP 2.0 enum with labels/colours/ranking and STIX marking IDs. |
| `gnat/analysis/confidence.py` | Implements Admiralty Scale + STIX numeric confidence composite model. |
| `gnat/analysis/investigations/__init__.py` | Exposes investigations public API surface. |
| `gnat/analysis/investigations/models.py` | Adds Investigation domain dataclasses + enums + serialization/state machine. |
| `gnat/analysis/investigations/service.py` | Adds Investigation lifecycle/mutation service layer. |
| `gnat/analysis/investigations/storage.py` | Adds SQLAlchemy persistence for investigations (JSON blob + indexed fields). |
| `gnat/reporting/__init__.py` | Exposes reporting public API surface and STIX export entrypoint. |
| `gnat/reporting/models.py` | Adds Report domain dataclasses + enums + serialization/state machine. |
| `gnat/reporting/service.py` | Adds Report lifecycle/mutation service layer + publish/revision workflow. |
| `gnat/reporting/storage.py` | Adds SQLAlchemy persistence for reports (JSON blob + indexed fields). |
| `gnat/reporting/export/__init__.py` | Exposes STIX export helper. |
| `gnat/reporting/export/stix.py` | Implements Report → STIX 2.1 bundle serialization. |
| `gnat/reporting/templates/incident_report.yaml` | Adds incident report YAML template and guidance. |
| `gnat/reporting/templates/threat_actor_profile.yaml` | Adds threat actor profile YAML template and guidance. |
| `gnat/reporting/templates/campaign_analysis.yaml` | Adds campaign analysis YAML template and guidance. |
| `tests/unit/analysis/__init__.py` | Test package marker. |
| `tests/unit/reporting/__init__.py` | Test package marker. |
| `tests/unit/analysis/test_confidence.py` | Unit coverage for TLP and confidence scoring. |
| `tests/unit/analysis/test_investigations.py` | Unit coverage for investigations model/store/service. |
| `tests/unit/reporting/test_reports.py` | Unit coverage for reporting model/store/service and STIX export. |
| `docs/explanation/architecture/adrs/0031-analysis-layer-architecture.md` | Documents analysis-layer architecture decisions. |
| `docs/explanation/architecture/adrs/0032-stix-custom-objects.md` | Documents custom STIX object/relationship decisions. |
| `docs/explanation/architecture/adrs/0033-confidence-scoring.md` | Documents confidence scoring rationale and conventions. |
| `docs/explanation/architecture/adrs/0034-report-lifecycle.md` | Documents report lifecycle state machine + publish semantics. |
| `CHANGELOG.md` | Adds unreleased entry describing analysis/reporting features + tests. |


Copilot · 2026-04-07T13:33:14Z

+
+        report.status       = ReportStatus.PUBLISHED
+        report.published_at = datetime.now(tz=timezone.utc)
+        report.updated_at   = report.published_at


publish() sets report.updated_at = report.published_at, but ReportStore.save() unconditionally overwrites report.updated_at with datetime.now(...), so the persisted updated_at will not match published_at as intended. Consider letting the service control updated_at for publish (or have the store only set updated_at when not already set / always rely on DB onupdate).

Suggested change

report.updated_at = report.published_at
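
One way to let the service control `updated_at` during publish, sketched under assumptions: the `touch` keyword, the in-memory store, and the simplified `Report` fields below are all hypothetical stand-ins, not the PR's `ReportStore` or `ReportService`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Report:  # simplified stand-in for the real model
    title: str
    status: str = "draft"
    published_at: Optional[datetime] = None
    updated_at: Optional[datetime] = None


class InMemoryReportStore:  # hypothetical stand-in for ReportStore
    def __init__(self) -> None:
        self.rows: dict[str, Report] = {}

    def save(self, report: Report, *, touch: bool = True) -> None:
        # Default behaviour stamps updated_at; callers that already set a
        # deliberate timestamp pass touch=False to keep it.
        if touch:
            report.updated_at = datetime.now(tz=timezone.utc)
        self.rows[report.title] = report


def publish(report: Report, store: InMemoryReportStore) -> None:
    now = datetime.now(tz=timezone.utc)
    report.status = "published"
    report.published_at = now
    report.updated_at = now
    store.save(report, touch=False)  # keep updated_at equal to published_at
```

The alternative mentioned in the comment, relying on a database-level `onupdate`, would move the timestamp out of Python entirely; in that case the service should stop assigning `updated_at` at all.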

Copilot · 2026-04-07T13:33:15Z

+            if linked_investigation is not None:
+                q = q.filter(ReportModel.linked_investigation == linked_investigation)
+            if tag is not None:
+                q = q.filter(ReportModel.tags_csv.contains(tag))
+            rows = (


list(tag=...) uses tags_csv.contains(tag), which will return false positives for substring matches (e.g., tag "ware" matches "ransomware", tag "a" matches nearly everything). Consider storing tags with a delimiter strategy that supports exact matches (e.g., wrapping with commas and searching for ,tag,) or normalizing tags into a separate table/JSON array and using an exact match query.
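
The delimiter-wrapping idea can be demonstrated with the standard-library `sqlite3` module instead of SQLAlchemy (a deliberate swap to keep the example dependency-free). The table layout and the helper names `encode_tags` and `find_by_tag` are hypothetical, and the sketch assumes tags never contain commas.

```python
import sqlite3


def encode_tags(tags: list[str]) -> str:
    # Store ",a,b," so every tag is bounded by commas on both sides.
    return ("," + ",".join(sorted(set(tags))) + ",") if tags else ""


def find_by_tag(conn: sqlite3.Connection, tag: str) -> list[str]:
    # Escape LIKE wildcards so "%" or "_" inside a tag match literally.
    escaped = tag.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT title FROM reports WHERE tags_csv LIKE ? ESCAPE '\\' ORDER BY title",
        (f"%,{escaped},%",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (title TEXT, tags_csv TEXT)")
conn.executemany(
    "INSERT INTO reports VALUES (?, ?)",
    [
        ("Ransomware wave", encode_tags(["ransomware", "finance"])),
        ("Firmware implant", encode_tags(["firmware"])),
        ("Ware report", encode_tags(["ware"])),
    ],
)

# A plain substring search for "ware" would match all three rows;
# the comma-bounded pattern matches only the exact tag.
exact = find_by_tag(conn, "ware")
```

Note that SQLite's `LIKE` is case-insensitive for ASCII by default, so tags should be normalized to one case before storage. A separate tag table or a JSON array column avoids the pattern matching altogether.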

Copilot · 2026-04-07T13:33:15Z

+- `tests/unit/analysis/test_confidence.py`: 16 tests covering TLP ordering, STIX marking IDs, confidence bands, Admiralty Scale, serialization roundtrips, bounds validation
+- `tests/unit/analysis/test_investigations.py`: 24 tests covering model roundtrips, state machine valid/invalid transitions, full service lifecycle (create/get/transition/note/task/hypothesis/link/delete/list/summary)
+- `tests/unit/reporting/test_reports.py`: 30 tests covering report model, evidence links, attribution, full DRAFT→PUBLISHED lifecycle, immutability enforcement, STIX bundle structure and field correctness, revision creation


The test counts in the release notes don’t match the actual new test files in this PR (e.g., the PR description mentions 19/24/38, but this section lists 16/24/30). Please update these numbers to reflect the current tests so the changelog remains accurate.

Suggested change

-- `tests/unit/analysis/test_confidence.py`: 16 tests covering TLP ordering, STIX marking IDs, confidence bands, Admiralty Scale, serialization roundtrips, bounds validation
-- `tests/unit/analysis/test_investigations.py`: 24 tests covering model roundtrips, state machine valid/invalid transitions, full service lifecycle (create/get/transition/note/task/hypothesis/link/delete/list/summary)
-- `tests/unit/reporting/test_reports.py`: 30 tests covering report model, evidence links, attribution, full DRAFT→PUBLISHED lifecycle, immutability enforcement, STIX bundle structure and field correctness, revision creation
+- `tests/unit/analysis/test_confidence.py`: 19 tests covering TLP ordering, STIX marking IDs, confidence bands, Admiralty Scale, serialization roundtrips, bounds validation
+- `tests/unit/analysis/test_investigations.py`: 24 tests covering model roundtrips, state machine valid/invalid transitions, full service lifecycle (create/get/transition/note/task/hypothesis/link/delete/list/summary)
+- `tests/unit/reporting/test_reports.py`: 38 tests covering report model, evidence links, attribution, full DRAFT→PUBLISHED lifecycle, immutability enforcement, STIX bundle structure and field correctness, revision creation

Copilot · 2026-04-07T13:33:15Z

+        These are the well-known IDs registered by FIRST at
+        ``https://www.first.org/tlp/``.
+        """
+        return _STIX_MARKING_IDS.get(self, f"marking-definition--{self.value}")
+


TLPLevel.stix_marking_id falls back to marking-definition--{self.value} for levels missing from _STIX_MARKING_IDS (notably AMBER_STRICT). That fallback is not a valid STIX marking-definition ID (must be marking-definition--<uuid>), so exporting objects classified as AMBER+STRICT will emit invalid object_marking_refs. Add the official marking-definition UUID for AMBER_STRICT (and any other missing levels) or raise a clear error rather than returning an invalid ID.

Copilot · 2026-04-07T13:33:15Z

+REPORT_TRANSITIONS: dict[ReportStatus, frozenset[ReportStatus]] = {
+    ReportStatus.DRAFT:     frozenset({ReportStatus.REVIEW, ReportStatus.ARCHIVED}),
+    ReportStatus.REVIEW:    frozenset({ReportStatus.DRAFT, ReportStatus.APPROVED, ReportStatus.ARCHIVED}),
+    ReportStatus.APPROVED:  frozenset({ReportStatus.PUBLISHED, ReportStatus.DRAFT, ReportStatus.ARCHIVED}),
+    ReportStatus.PUBLISHED: frozenset({ReportStatus.ARCHIVED}),


REPORT_TRANSITIONS allows APPROVED → DRAFT, but ADR-0034 describes DRAFT ↔ REVIEW as the only bidirectional transition (approval is meant to be a review gate). Either remove ReportStatus.DRAFT from the allowed transitions out of APPROVED, or update ADR-0034 to match the intended lifecycle.
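
A small check can keep the transition table and the ADR in agreement by listing every transition that moves backwards in the lifecycle. In this sketch `PR_TRANSITIONS` copies the four rows shown in the diff above, `ADR_TRANSITIONS` is the same table with APPROVED → DRAFT removed, and the helper `backward_edges` is hypothetical. Lifecycle order is taken from the enum's definition order, which is an assumption.

```python
from enum import Enum


class ReportStatus(Enum):
    DRAFT = "draft"
    REVIEW = "review"
    APPROVED = "approved"
    PUBLISHED = "published"
    ARCHIVED = "archived"


S = ReportStatus

# The four rows visible in the diff (the ARCHIVED row is not shown there).
PR_TRANSITIONS = {
    S.DRAFT: frozenset({S.REVIEW, S.ARCHIVED}),
    S.REVIEW: frozenset({S.DRAFT, S.APPROVED, S.ARCHIVED}),
    S.APPROVED: frozenset({S.PUBLISHED, S.DRAFT, S.ARCHIVED}),
    S.PUBLISHED: frozenset({S.ARCHIVED}),
}

# Same table with APPROVED -> DRAFT removed, matching the ADR as described.
ADR_TRANSITIONS = {**PR_TRANSITIONS, S.APPROVED: frozenset({S.PUBLISHED, S.ARCHIVED})}


def backward_edges(transitions):
    """Return every (source, target) pair that moves to an earlier state."""
    rank = {status: i for i, status in enumerate(ReportStatus)}
    return {
        (src, dst)
        for src, targets in transitions.items()
        for dst in targets
        if rank[dst] < rank[src]
    }
```

A unit test asserting that `backward_edges(...)` equals `{(REVIEW, DRAFT)}` would fail on the table as submitted and pass once either the code or the ADR is updated consistently.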

Copilot · 2026-04-07T13:33:16Z

+        if note and author:
+            inv.notes.append(AnalystNote(
+                content = f"**Status changed:** `{old_status.value}` → `{new_status.value}`\n\n{note}",
+                author  = author,
+            ))


transition() says author is required if note is provided, but if note is set without author the implementation silently drops the note. Raise InvestigationError when note is provided without author (or update the docstring to reflect the actual behavior).
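
The stricter behaviour could look like the sketch below. The dataclasses are simplified stand-ins for the real models, the function name `record_transition` is hypothetical, and status values are plain strings rather than the project's enum.

```python
from dataclasses import dataclass, field
from typing import Optional


class InvestigationError(Exception):
    pass


@dataclass
class AnalystNote:
    content: str
    author: str


@dataclass
class Investigation:  # simplified stand-in
    status: str = "open"
    notes: list = field(default_factory=list)


def record_transition(
    inv: Investigation,
    new_status: str,
    note: Optional[str] = None,
    author: Optional[str] = None,
) -> None:
    # Validate before mutating so a rejected call leaves inv untouched.
    if note and not author:
        raise InvestigationError(
            "author is required when a transition note is provided"
        )
    old_status = inv.status
    inv.status = new_status
    if note:
        inv.notes.append(AnalystNote(
            content=f"**Status changed:** `{old_status}` → `{new_status}`\n\n{note}",
            author=author,
        ))
```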

Copilot · 2026-04-07T13:33:17Z

+                q = q.filter(InvestigationModel.created_by == created_by)
+            if tag is not None:
+                q = q.filter(InvestigationModel.tags_csv.contains(tag))
+            rows = q.order_by(InvestigationModel.updated_at.desc()).offset(offset).limit(limit).all()


list(tag=...) uses tags_csv.contains(tag), which can produce substring false positives (e.g., "ware" matches "ransomware"). Consider an exact-match strategy (delimiter wrapping) or a normalized tag representation to avoid incorrect filtering results.

Copilot · 2026-04-07T13:33:17Z

+    if not _SA_AVAILABLE:
+        raise ImportError(
+            "sqlalchemy is required for report persistence. "
+            "Install with: pip install 'gnat[persist]'"


_require_sqlalchemy() instructs users to install gnat[persist], but this PR also adds [reporting] extras that include SQLAlchemy. Consider updating the message to mention the relevant extras (e.g., gnat[reporting] / gnat[persist]) so installation guidance matches packaging options.

Suggested change

-"Install with: pip install 'gnat[persist]'"
+"Install with: pip install 'gnat[reporting]' or pip install 'gnat[persist]'"

Copilot · 2026-04-07T13:33:17Z

+    if not _SA_AVAILABLE:
+        raise ImportError(
+            "sqlalchemy is required for investigation persistence. "
+            "Install with: pip install 'gnat[persist]'"


_require_sqlalchemy() instructs users to install gnat[persist], but this PR adds an [analysis] extra that also includes SQLAlchemy. Consider updating the guidance string to reference the relevant extras so installation instructions remain accurate.

Suggested change

-"Install with: pip install 'gnat[persist]'"
+"Install with: pip install 'gnat[persist]' or pip install 'gnat[analysis]'"

Copilot · 2026-04-07T13:33:17Z

+key_findings, evidence_links) become read-only. Updates produce a new
+Report version with `parent_report_id` pointing to the previous
+published version and `version` incremented. This mirrors the STIX 2.1
+versioning model where `modified` creates a logical new version rather
+than mutating the original.
+
+**Versioning implementation:**
+`ReportService.publish(report_id)` increments `version`, sets
+`published_at`, generates the STIX bundle, and marks content as
+immutable via a `is_published` flag in storage. A new draft is created
+with `parent_report_id` set when an analyst wants to revise a published
+report.


ADR-0034’s “Versioning implementation” section says publish() increments version and that immutability is marked via an is_published flag in storage, but the current implementation neither increments Report.version on publish nor stores an is_published flag (immutability is enforced via Report.status). Update the ADR or the implementation so they match.

Suggested change

-key_findings, evidence_links) become read-only. Updates produce a new
-Report version with `parent_report_id` pointing to the previous
-published version and `version` incremented. This mirrors the STIX 2.1
-versioning model where `modified` creates a logical new version rather
-than mutating the original.
-
-**Versioning implementation:**
-`ReportService.publish(report_id)` increments `version`, sets
-`published_at`, generates the STIX bundle, and marks content as
-immutable via a `is_published` flag in storage. A new draft is created
-with `parent_report_id` set when an analyst wants to revise a published
-report.
+key_findings, evidence_links) become read-only. Immutability is
+enforced by `Report.status = PUBLISHED`, rather than by a separate
+storage flag. Updates to published content produce a new draft version
+with `parent_report_id` pointing to the previous published report and
+`version` incremented for that new draft. This mirrors the STIX 2.1
+versioning model where `modified` creates a logical new version rather
+than mutating the original.
+
+**Versioning implementation:**
+`ReportService.publish(report_id)` transitions the report to
+PUBLISHED, sets `published_at`, and generates the STIX bundle.
+Content immutability is enforced by the PUBLISHED status. When an
+analyst wants to revise a published report, the system creates a new
+draft with `parent_report_id` set to the prior published report and
+an incremented `version`.
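
The revision workflow described in the suggested ADR text can be sketched as follows. The `Report` fields are a simplified stand-in, and `ReportError` and `update_body` are hypothetical names; only `create_revision`, `parent_report_id`, and `version` come from the PR.

```python
import uuid
from dataclasses import dataclass, field, replace
from typing import Optional


class ReportError(Exception):  # hypothetical name
    pass


@dataclass
class Report:  # simplified stand-in
    title: str
    body: str
    status: str = "draft"
    version: int = 1
    parent_report_id: Optional[str] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def update_body(report: Report, body: str) -> None:
    # Immutability is derived from status, not from a separate flag.
    if report.status == "published":
        raise ReportError("published reports are immutable; create a revision")
    report.body = body


def create_revision(published: Report) -> Report:
    if published.status != "published":
        raise ReportError("only published reports can be revised")
    # Copy the content into a fresh draft that points back at its parent.
    return replace(
        published,
        id=str(uuid.uuid4()),
        status="draft",
        version=published.version + 1,
        parent_report_id=published.id,
    )
```

The published report is never modified, so anything that already references its ID, including an exported STIX bundle, stays valid while the new draft moves through review.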

Copilot AI review requested due to automatic review settings April 7, 2026 13:27

wrhalpin merged commit 91f87cb into main Apr 7, 2026
10 of 22 checks passed

Copilot started reviewing on behalf of wrhalpin April 7, 2026 13:28

Copilot AI reviewed Apr 7, 2026

wrhalpin merged 1 commit into main from claude/add-claude-documentation-k8vvJ

3 participants