Add Phase 4 documentation: ADRs 0039-0049, how-to guides, Sphinx API … by wrhalpin · Pull Request #94 · wrhalpin/GNAT

wrhalpin · 2026-04-09T00:35:00Z

…docs, diagrams

ADRs 0039-0049 covering all Phase 4 subsystems: ExecutionContext, connector trust model, idempotency/schema evolution, hypothesis engine, negative evidence, reasoning engine, agent governance, HITL gateway, workspace isolation, query budget, and the simulation/replay testing framework
Updated ADR README Table of Contents with entries 38-49 (including Phase 4 section)
Extended docs/explanation/architecture/diagrams.md with Phase 4 layer table rows and Control Layer Mermaid diagram; added ADR cross-links 0039-0049
Added 5 new Mermaid sequence/flowchart diagrams to workflow-diagrams.md covering ExecutionContext propagation, HypothesisEngine lifecycle, ReasoningEngine scoring, Agent Governance & HITL flow, and workspace trust boundary enforcement
Added [agent_policy], [connector_limits], [workspace_defaults], [execution_context] sections to docs/reference/configuration.md
Three new how-to guides: use-execution-context.md, use-reasoning-engine.md, agent-governance.md; added Phase 4 section to how-to/README.md
New Sphinx RST pages: reasoning.rst, agents_governance.rst, api/core.rst, api/reasoning.rst, api/agents_governance.rst; updated index.rst toctrees

https://claude.ai/code/session_01BDoue9HxB83ijLzFARAugq

Copilot

Pull request overview

Adds Phase 4 (“control/safety”, reasoning, and agent governance) documentation across ADRs, how-to guides, config reference, and Sphinx pages so users can discover and use the new subsystems.

Changes:

Added ADRs 0039–0049 describing Phase 4 subsystems (ExecutionContext/trust/budget, reasoning/hypotheses/negative evidence, governance/HITL, workspace isolation, simulation/replay testing).
Added/updated user-facing docs: new how-to guides, updated configuration reference, and expanded architecture diagrams/workflows.
Extended Sphinx docs: new narrative pages plus new API reference entry points and updated toctrees.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 29 comments.

Show a summary per file

| File | Description |
| --- | --- |
| docs/explanation/architecture/adrs/0039-ADR-execution-context.md | ExecutionContext ADR (factored factories, budget, persistence) |
| docs/explanation/architecture/adrs/0040-ADR-connector-trust-model.md | Trust-level model ADR and its integration with ExecutionContext |
| docs/explanation/architecture/adrs/0041-ADR-idempotency-schema-evolution.md | Idempotency/schema evolution ADR |
| docs/explanation/architecture/adrs/0042-ADR-hypothesis-engine.md | Hypothesis engine ADR (lifecycle + evaluation model) |
| docs/explanation/architecture/adrs/0043-ADR-negative-evidence.md | Negative-evidence ADR (TTL suppression model) |
| docs/explanation/architecture/adrs/0044-ADR-reasoning-engine.md | Reasoning engine ADR (scoring + Solr corroboration + notes) |
| docs/explanation/architecture/adrs/0045-ADR-agent-governance.md | Agent governance ADR (permission matrix, rate limits, audit log) |
| docs/explanation/architecture/adrs/0046-ADR-hitl-gateway.md | HITL gateway ADR (review queue bridge + escalation) |
| docs/explanation/architecture/adrs/0047-ADR-workspace-isolation.md | Workspace isolation ADR (trust boundaries + connector allowlists) |
| docs/explanation/architecture/adrs/0048-ADR-query-budget.md | Query budget ADR (budget charging + exhaustion behavior) |
| docs/explanation/architecture/adrs/0049-ADR-testing-framework.md | Simulation/replay/testing ADR (SimulationConnector/ReplayRunner/Harness) |
| docs/explanation/architecture/adrs/README.md | ADR TOC updates for 0039–0049 |
| docs/explanation/architecture/diagrams.md | Expanded architecture diagrams to include Phase 4 layers |
| docs/explanation/architecture/workflow-diagrams.md | Added Phase 4 workflow Mermaid diagrams (context propagation, reasoning, governance, trust boundaries) |
| docs/reference/configuration.md | Documented new Phase 4 configuration sections/keys |
| docs/how-to/use-execution-context.md | How-to guide for creating/propagating ExecutionContext, boundaries, and budgets |
| docs/how-to/use-reasoning-engine.md | How-to guide for ReasoningEngine/HypothesisEngine/NegativeEvidence + Solr integration |
| docs/how-to/agent-governance.md | How-to guide for AgentGovernor/HITL + testing harness usage |
| docs/how-to/README.md | Updated how-to index to include Phase 4 guides |
| docs/sphinx-html/source/index.rst | Added new Sphinx toctree entries for Phase 4 docs + API pages |
| docs/sphinx-html/source/reasoning.rst | New Sphinx narrative page for the reasoning layer |
| docs/sphinx-html/source/agents_governance.rst | New Sphinx narrative page for agent governance/HITL |
| docs/sphinx-html/source/api/core.rst | New Sphinx API reference entry point for gnat.core |
| docs/sphinx-html/source/api/reasoning.rst | New Sphinx API reference entry point for gnat.reasoning |
| docs/sphinx-html/source/api/agents_governance.rst | New Sphinx API reference entry point for governance/HITL/testing |

Copilot · 2026-04-09T00:51:27Z

+## 9. Hypothesis Engine Lifecycle (Phase 4C)
+
+The full propose → evaluate → close lifecycle for `STIXHypothesis` objects, showing
+how Solr corroboration and trust-weighted evidence feed into confidence updates.


This workflow text says HypothesisEngine confidence updates use "trust-weighted evidence", but the current HypothesisEngine.evaluate() implementation does not apply trust weights. Consider rewording to just describe Solr corroboration (or update the implementation to weight by trust if that’s a requirement).

Suggested change

-how Solr corroboration and trust-weighted evidence feed into confidence updates.
+how Solr corroboration and supporting/refuting evidence feed into confidence updates.
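
For illustration only, here is a minimal sketch of what an unweighted confidence update from supporting and refuting evidence could look like. The function name `update_confidence`, the `step` size, and the clamping to [0, 1] are assumptions made for this example; none of them are taken from `HypothesisEngine.evaluate()`.

```python
def update_confidence(prior: float, supporting: int, refuting: int,
                      step: float = 0.05) -> float:
    """Hypothetical unweighted update: every evidence item counts the same.

    No trust weight is applied, which is the behavior the review comment
    attributes to the current implementation.
    """
    updated = prior + step * (supporting - refuting)
    # Clamp so the result stays a valid confidence value.
    return max(0.0, min(1.0, updated))


if __name__ == "__main__":
    # Three supporting items and one refuting item nudge confidence upward.
    print(round(update_confidence(0.5, supporting=3, refuting=1), 2))
```

If trust weighting becomes a requirement later, the per-item `step` is the natural place to multiply in a weight.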

Copilot · 2026-04-09T00:51:28Z

+The `AgentTestHarness` provides a fully deterministic test environment — all HITL
+submissions are auto-approved and all rate limits are effectively unlimited:


AgentTestHarness does not auto-approve high/critical actions: HITLGateway.evaluate() returns (False, review_item) for impact_level high/critical even when using the mock review service. Update this text to clarify that the harness makes review submission deterministic (but still not auto-approved for high/critical) or update the harness/gateway behavior to match the claim.

Suggested change

-The `AgentTestHarness` provides a fully deterministic test environment — all HITL
-submissions are auto-approved and all rate limits are effectively unlimited:
+The `AgentTestHarness` provides a fully deterministic test environment — review
+submissions are handled deterministically by the mock HITL flow, low-impact actions
+like the example below are auto-approved, and rate limits are effectively unlimited.
+High- and critical-impact actions are still submitted for review rather than being
+auto-approved:

Copilot · 2026-04-09T00:51:28Z

+print(f"{neg['count']} fresh neg records, penalty={neg['penalty']:.2f}")
+
+corr = explanation["components"]["corroboration"]
+print(f"{corr['hits']} Solr hits, bonus={corr['bonus']:.2f}")


The corroboration component in ReasoningEngine’s explanation dict reports a derived/capped hits count (based on bonus/0.05), not the raw Solr hit count. This example prints it as "Solr hits", which is misleading once the bonus cap is reached. Consider renaming the field in docs (e.g. "effective_hits") or updating the engine to record the actual hit count separately.

Suggested change

-print(f"{corr['hits']} Solr hits, bonus={corr['bonus']:.2f}")
+print(f"{corr['hits']} effective hits, bonus={corr['bonus']:.2f}")
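
To make the distinction concrete, the sketch below shows how a hit count derived from a capped bonus diverges from the raw hit count. The 0.05 per-hit bonus comes from the comment above. The cap value `BONUS_CAP = 0.20`, the function name, and the extra `raw_hits` key are assumptions for illustration, not the engine's actual values or fields.

```python
PER_HIT_BONUS = 0.05  # per-hit bonus mentioned in the review comment
BONUS_CAP = 0.20      # hypothetical cap; the real value is not shown in this PR


def corroboration_component(raw_hits: int) -> dict:
    """Build an explanation entry where 'hits' is derived from the capped bonus."""
    bonus = min(raw_hits * PER_HIT_BONUS, BONUS_CAP)
    return {
        "hits": round(bonus / PER_HIT_BONUS),  # effective hits, not the raw count
        "bonus": bonus,
        "raw_hits": raw_hits,  # hypothetical extra field recording the real count
    }


if __name__ == "__main__":
    corr = corroboration_component(raw_hits=12)
    print(f"{corr['hits']} effective hits, {corr['raw_hits']} raw hits, "
          f"bonus={corr['bonus']:.2f}")
```

Below the cap the two counts agree; once the cap is reached, only a separately recorded raw count tells the reader how many Solr documents actually matched.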

Copilot · 2026-04-09T00:51:28Z

+* :class:`~gnat.agents.hitl.HITLGateway` — bridges ``AgentGovernor`` to the existing
+  :class:`~gnat.review.service.ReviewService`; low/medium-impact actions are
+  auto-approved, high-impact actions block until a human reviewer approves, and
+  critical actions also trigger XSOAR notifications.


This overview says high-impact actions "block until a human reviewer approves", but HITLGateway.evaluate() currently returns (False, review_item) for high/critical and does not block; callers must poll check_approval_status() and decide when to execute. Please adjust the wording to reflect the non-blocking API (or implement a blocking helper if that’s intended).
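
A rough sketch of the non-blocking pattern described above, using a stand-in class instead of the real gateway. `SketchGateway`, `ReviewItem`, the `approve()` helper, and all method signatures are assumptions for illustration. Only the `(approved, review_item)` return shape, the `check_approval_status()` name, and the low/medium versus high/critical split come from this thread.

```python
from __future__ import annotations

from dataclasses import dataclass
from itertools import count


@dataclass
class ReviewItem:
    item_id: int
    action: str
    impact_level: str
    status: str = "pending"


class SketchGateway:
    """Stand-in for a HITL gateway; not the gnat.agents.hitl implementation."""

    AUTO_APPROVE = {"low", "medium"}

    def __init__(self) -> None:
        self._ids = count(1)
        self._items: dict[int, ReviewItem] = {}

    def evaluate(self, action: str, impact_level: str) -> tuple[bool, ReviewItem | None]:
        # Low/medium impact: approved immediately, nothing to review.
        if impact_level in self.AUTO_APPROVE:
            return True, None
        # High/critical impact: queue a review item and return without blocking.
        item = ReviewItem(next(self._ids), action, impact_level)
        self._items[item.item_id] = item
        return False, item

    def check_approval_status(self, item_id: int) -> str:
        return self._items[item_id].status

    def approve(self, item_id: int) -> None:
        # Hypothetical helper that plays the role of the human reviewer.
        self._items[item_id].status = "approved"


if __name__ == "__main__":
    gateway = SketchGateway()
    approved, item = gateway.evaluate("isolate-host", "high")
    print(approved, gateway.check_approval_status(item.item_id))  # caller must poll
    gateway.approve(item.item_id)
    print(gateway.check_approval_status(item.item_id))
```

The caller, not the gateway, decides how long to poll and when to execute the action, which is why "block until approved" misdescribes this style of API.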

Copilot · 2026-04-09T00:51:28Z

+   :undoc-members:
+
+ExecutionContext
+---------------


The underline for the "ExecutionContext" section title is one character shorter than the title text, which can trigger docutils/Sphinx warnings ("Title underline too short"). Extend the underline so it is at least the same length as the title.

Suggested change

----------------
+----------------
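
This class of warning can be caught before a Sphinx build with a small check like the one below. It is a simplified sketch, not a substitute for docutils: the helper name `short_underlines` and the set of adornment characters are choices made for this example, and overlined titles are not handled.

```python
ADORNMENTS = set("=-~^\"'`#*+")


def short_underlines(rst_text: str) -> list[tuple[int, str]]:
    """Return (1-based line number, title) for titles with a too-short underline."""
    lines = rst_text.splitlines()
    problems = []
    for i in range(1, len(lines)):
        title, underline = lines[i - 1].rstrip(), lines[i].rstrip()
        # An underline is a run of one repeated adornment character.
        is_underline = (
            len(underline) >= 2
            and underline[0] in ADORNMENTS
            and set(underline) == {underline[0]}
        )
        # Skip blank lines and lines that are themselves adornment runs.
        is_title = bool(title.strip()) and not (set(title.strip()) <= ADORNMENTS)
        if is_underline and is_title and len(underline) < len(title):
            problems.append((i, title))
    return problems


if __name__ == "__main__":
    sample = "ExecutionContext\n" + "-" * 15 + "\n"
    print(short_underlines(sample))
```

Running it over the changed `.rst` files in CI would flag the 15-character underline under the 16-character title reported here.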

Copilot · 2026-04-09T00:51:33Z

+The mapping is defined in `gnat/core/trust.py` as `TRUST_WEIGHTS: dict[str, float]`
+and shared between `ExecutionContext`, `HypothesisEngine`, and `ReasoningEngine`
+to ensure a single source of truth.


This claims the trust-weight mapping lives in gnat/core/trust.py as TRUST_WEIGHTS, but there is no gnat.core.trust module in the codebase and the trust→weight mapping is currently defined inside gnat.reasoning.engine (_TRUST_WEIGHTS). Update this reference so readers can find the real source of truth.

Suggested change

-The mapping is defined in `gnat/core/trust.py` as `TRUST_WEIGHTS: dict[str, float]`
-and shared between `ExecutionContext`, `HypothesisEngine`, and `ReasoningEngine`
-to ensure a single source of truth.
+The current implementation defines this mapping in `gnat.reasoning.engine`
+as `_TRUST_WEIGHTS: dict[str, float]`. This ADR should reference that module as
+the current source of truth for the trust→weight mapping.

Copilot · 2026-04-09T00:51:33Z

+class NullSearchIndex(SearchIndex):
+    """No-op search index used when Solr is unavailable."""
+
+    def query(self, query: str, fields: list[str] | None = None) -> list[dict]:


This NullSearchIndex example defines a query() method, but the actual search interface is SearchIndex.search(...) (and NullSearchIndex implements search, not query). Update this ADR snippet to match the real gnat.search.index.SearchIndex API.

Suggested change

-    def query(self, query: str, fields: list[str] | None = None) -> list[dict]:
+    def search(self, query: str, fields: list[str] | None = None) -> list[dict]:
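
For context, a self-contained sketch of the null-object pattern with the corrected method name. The abstract base class here is a stand-in written for this example. The real `gnat.search.index.SearchIndex` may define more methods or different parameters, and returning an empty list is an assumption about what "no-op" means.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class SearchIndex(ABC):
    """Stand-in interface; mirrors only the search() signature quoted above."""

    @abstractmethod
    def search(self, query: str, fields: list[str] | None = None) -> list[dict]:
        ...


class NullSearchIndex(SearchIndex):
    """No-op search index used when Solr is unavailable."""

    def search(self, query: str, fields: list[str] | None = None) -> list[dict]:
        # Assumed behavior: report no hits instead of raising.
        return []


if __name__ == "__main__":
    index: SearchIndex = NullSearchIndex()
    print(index.search("apt29", fields=["name"]))
```

Because the base method is abstract, a subclass that implements only `query()` cannot be instantiated, so this particular naming mismatch would surface at construction time.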

Copilot · 2026-04-09T00:51:33Z

+```python
+from gnat.reasoning.engine import ReasoningEngine
+from gnat.search import GNATIndexer
+from gnat.core.context import ExecutionContext
+


gnat.search.GNATIndexer is referenced here, but there is no GNATIndexer symbol in gnat.search. The existing Solr integration uses SolrSearchIndex / SearchIndex and the build_search_index() factory. Update this example to use the actual search classes.

Copilot · 2026-04-09T00:51:34Z

+    store=workspace_store,
+    search_index=GNATIndexer.from_config(config),
+    neg_store=neg_evidence_store,


This ReasoningEngine construction uses store=..., neg_store=... parameters, but the implemented gnat.reasoning.engine.ReasoningEngine constructor takes manager + workspace_name (and gathers negative evidence from the workspace). Update the ADR usage example to match the current engine signature.

Suggested change

-    store=workspace_store,
-    search_index=GNATIndexer.from_config(config),
-    neg_store=neg_evidence_store,
+    manager=workspace_manager,
+    workspace_name="acme-corp",
+    search_index=GNATIndexer.from_config(config),

Copilot · 2026-04-09T00:51:34Z

+        trust_weights: dict[str, float] | None = None,
+    ) -> None:
+        self._store = store
+        self._search = search_index
+        self._weights = trust_weights or TRUST_WEIGHTS  # from gnat.core.trust


This snippet references TRUST_WEIGHTS "from gnat.core.trust", but there is no gnat.core.trust module in the codebase and the implemented HypothesisEngine constructor does not accept trust_weights. Update the snippet to match the real constructor and where trust weights are actually defined/used.

Suggested change

-        trust_weights: dict[str, float] | None = None,
-    ) -> None:
-        self._store = store
-        self._search = search_index
-        self._weights = trust_weights or TRUST_WEIGHTS  # from gnat.core.trust
+    ) -> None:
+        self._store = store
+        self._search = search_index
+        # Trust weights are not injected via the constructor; they are
+        # applied by the evaluation/scoring logic where they are defined.

Copilot AI review requested due to automatic review settings April 9, 2026 00:35

wrhalpin merged commit df3656e into main Apr 9, 2026
7 of 19 checks passed

Copilot started reviewing on behalf of wrhalpin April 9, 2026 00:35

Copilot AI reviewed Apr 9, 2026

wrhalpin merged 1 commit into main from claude/add-claude-documentation-k8vvJ

3 participants
