Skip to content

Claude/live presentation deck t yl ep#179

Merged
wrhalpin merged 5 commits intomainfrom
claude/live-presentation-deck-tYlEP
Apr 26, 2026
Merged

Claude/live presentation deck t yl ep#179
wrhalpin merged 5 commits intomainfrom
claude/live-presentation-deck-tYlEP

Conversation

@wrhalpin
Copy link
Copy Markdown
Owner

No description provided.

claude added 5 commits April 26, 2026 23:20
Core components:
- gnat/agents/conversations.py: ConversationStore, SessionContext, ConversationTurn
  Thread-safe SQLite-backed persistent conversation storage with turn history

- gnat/agents/copilot_investigation.py: InvestigationCopilotSession
  Bidirectional agent that guides analysts through investigations
  Supports: ask_clarifying_question, refine_hypothesis, suggest_next_step, invoke_workflow_step
  Phase machine: IDLE → GATHERING → HYPOTHESIZING → TESTING → CLOSING → COMPLETE

- gnat/agents/assistant_analyst.py: LiveAnalystAssistantSession
  Stateless context-aware agent for on-demand analysis
  Supports: suggest_enrichment (streaming), draft_report_section (batched), explain_finding (streaming), search_help (streaming)

- gnat/serve/routers/chat.py: FastAPI routes for dual interfaces
  Copilot: POST /copilot/start, /copilot/ask, /copilot/suggest-step
  Assistant: POST /assistant/start, /assistant/search-help, /assistant/explain
  All: GET /history for conversation persistence
  Uses SSE (Server-Sent Events) for streaming responses

- Updated gnat/agents/__init__.py and gnat/serve/app.py to register new components

Next: Phase 2 will add TUI screens and prompt templates

https://claude.ai/code/session_01FUJQyGdWpZSgYkW1Xb95gU
Copilot prompt templates:
- gnat/agents/copilot_prompts.py: Organized by investigation phase
  GATHERING → HYPOTHESIZING → TESTING → CLOSING
  Candidate questions for each phase, context-aware selection
  System prompts for copilot and assistant behavior

Hypothesis refinement scoring:
- gnat/agents/hypothesis_refinement.py: Confidence scoring based on analyst feedback
  FeedbackType classification (CONFIRMS, CONTRADICTS, REFINES, NEUTRAL)
  Connector trust weighting (trusted_internal=0.9, semi_trusted=0.6, untrusted=0.3)
  Evidence accumulation (supporting_evidence_count, contradicting_evidence_count)
  Refinement report generation for audit trail

TUI screens:
- gnat/tui/screens/copilot_screen.py: F10 Investigation Copilot modal
  Multi-turn conversation with streaming responses
  Status bar (phase, IOC count, confidence)
  Slash commands: /next, /close, /help

- gnat/tui/screens/assistant_screen.py: F11 Live Analyst Assistant modal
  On-demand helpers: /enrich, /draft, /explain, /search
  Streaming and batched response support

App integration:
- Updated gnat/tui/app.py to register F10 and F11 keybindings
  action_open_copilot(), action_open_assistant() methods
  Modal screen push/pop with investigation context binding

Updated copilot_investigation.py to use copilot_prompts and hypothesis_refinement

Next: Phase 3 will focus on Assistant enhancements and batched operations

https://claude.ai/code/session_01FUJQyGdWpZSgYkW1Xb95gU
Safety governance framework:
- gnat/agents/copilot_governor.py: AgentGovernor integration for safety gates
  CopilotAction & AssistantAction enums for all operations
  ActionRisk classification (LOW, MEDIUM, HIGH, CRITICAL)
  GovernedAction dataclass with audit metadata
  Risk assessment based on action type + confidence score
  Permissions check with HITL escalation for high-risk actions

Cost tracking:
  CostTracker for token usage and cost estimation
  Model-specific pricing (Claude Opus/Sonnet/Haiku)
  Cost alerts when threshold exceeded (default $10 USD)
  Per-investigation usage statistics

Human-in-the-loop (HITL) review:
- gnat/agents/copilot_review.py: ReviewService integration
  CopilotReviewRequest for structured review submissions
  CopilotReviewManager handles high-confidence suggestions
  submit_hypothesis_for_review() — gates hypotheses >0.80 confidence
  submit_escalation_for_approval() — critical escalations
  check_review_status() — poll for analyst decisions
  await_review_decision() — blocking wait with timeout
  get_pending_reviews(), get_review_history() — query methods

Audit trail logging:
- gnat/agents/copilot_audit.py: ExecutionContext integration
  CopilotAuditEntry for individual operations
  CopilotAuditLog appends to execution_log with compliance tracking
  log_copilot_operation() — full audit with tokens, latency, confidence
  log_assistant_operation() — specialized for assistant queries
  get_audit_trail() — retrieve with filtering by time/investigation
  get_investigation_summary() — stats for investigation review
  export_audit_log() — JSON/CSV export for compliance reporting

Integration updates:
- Updated InvestigationCopilotSession.__init__ to instantiate:
  - governor: CopilotGovernor for permission checks
  - review_manager: CopilotReviewManager for HITL gates
  - audit_log: CopilotAuditLog for compliance
  - cost_tracker: CostTracker for token usage

- Updated gnat/agents/__init__.py to export all Phase 3 classes

Architecture:
- Risk-based permission model: LOW/MEDIUM → auto-approve, HIGH/CRITICAL → HITL
- Connector trust weighting (from Phase 4 model) already in hypothesis refinement
- Cost transparency: token costs tracked and estimated
- Audit compliance: all operations logged to ExecutionContext
- Governance hooks ready for workflow DAG pause/resume in Phase 4

Next: Phase 4 will implement workflow DAG integration and finalize safety

https://claude.ai/code/session_01FUJQyGdWpZSgYkW1Xb95gU
Built-in guided workflows:
- gnat/agents/copilot_workflows.py: Ready-to-use investigation templates
  CopilotGuidedPhishingTriage: Email phishing analysis with enrichment
    Steps: gather_details → assess_impact → enrichment → correlation → draft_report
    Analyst gates for impact assessment, escalation decision

  CopilotGuidedIncidentResponse: Full incident response coordination
    Steps: scope → impact → containment → investigation → recovery
    Copilot asks clarifying questions at each major decision point

  WorkflowFactory: Create/list workflows programmatically

Comprehensive Documentation:
- docs/how-to/copilot-assistant.md: User guide for TUI
  Quick start (F10 Copilot, F11 Assistant)
  Phase machine explanation (GATHERING → HYPOTHESIZING → TESTING → CLOSING)
  Slash command reference
  Safety & approvals gating
  Cost tracking overview
  Assistant commands with examples

- EXAMPLES_COPILOT.md: Code examples for all functionality
  Basic copilot session creation and conversation flow
  Hypothesis refinement with feedback
  Audit trail queries and investigation summaries
  Assistant enrichment, drafting, explanation, search help
  Guided workflow execution (phishing triage, incident response)
  Governance & review operations
  Cost tracking and audit log export
  Best practices and troubleshooting

CLI Interface:
- gnat/cli/copilot_cli.py: Command-line commands for copilot/assistant
  copilot start <investigation_id> — Start session
  copilot ask <conversation_id> <message> — Ask question
  copilot next <conversation_id> — Get next step recommendation
  copilot workflow <conversation_id> <workflow_type> — Run guided workflow
  copilot history <conversation_id> — Show conversation
  copilot audit <conversation_id> <investigation_id> — Show audit summary

  assistant enrich <conversation_id> <stix_type> <stix_value> — Enrichment suggestions
  assistant explain <conversation_id> <stix_type> <stix_value> — Explanation

Updated Exports:
- gnat/agents/__init__.py includes all Phase 4 classes
  CopilotGuidedPhishingTriage, CopilotGuidedIncidentResponse, WorkflowFactory

Complete Implementation Summary:
- Phase 1: Foundation (conversation store, basic agents, Web API)
- Phase 2: Intelligence (prompt templates, TUI screens, hypothesis refinement)
- Phase 3: Governance (safety gates, HITL review, audit trails, cost tracking)
- Phase 4: Polish (guided workflows, documentation, CLI, examples)

Total: 8,100+ lines across 20+ files
- 4 investigation agents (Research, Parsing, Copilot, Assistant)
- Bidirectional copilot-workflow integration
- Risk-based safety model with HITL escalation
- Complete audit trail with compliance export
- Streaming + batched hybrid LLM strategy
- Multi-interface support (TUI, Web, CLI)

Ready for: Production testing, integration with existing workflows, analyst training

https://claude.ai/code/session_01FUJQyGdWpZSgYkW1Xb95gU
TUI Enhancements:
- Color coding throughout: analyst (blue), copilot (red), assistant (yellow), system (dim yellow), suggestions (green borders)
- Copilot screen (F10): heavy CSS borders, Ctrl+C stream cancellation, F1 help overlay
- Assistant screen (F11): CSS docking (header top, response middle, input bottom), Ctrl+C request cancellation, F1 help
- Both screens show keybinding hints in input placeholders

Web API Enhancements:
- GET /api/chat/investigations/{inv_id}/export: Download conversation as JSON or CSV with timestamp/role/text/tokens/latency
- POST /api/chat/investigations/{inv_id}/copy: Copy suggestion text (ready for browser Clipboard API)
- GET /api/chat/investigations/{inv_id}/summary: Conversation metrics (turns, tokens, latency, duration, analyst vs agent message counts)

Keybindings:
- F10: Open Investigation Copilot
- F11: Open Live Analyst Assistant
- Escape: Close screen
- Ctrl+C: Cancel stream/request
- F1: Show help with commands and examples

Documentation:
- POLISH_CHANGELOG.md: Complete polish changelog with testing checklist and future enhancements

Closes Phase 5 polish work. Ready for production testing.

https://claude.ai/code/session_01FUJQyGdWpZSgYkW1Xb95gU
Copilot AI review requested due to automatic review settings April 26, 2026 23:35
@wrhalpin wrhalpin merged commit d9492b3 into main Apr 26, 2026
10 of 19 checks passed
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces an “Investigation Copilot” and “Live Analyst Assistant” feature set across GNAT’s TUI, REST API, CLI, and agent layer—adding conversation/session persistence, basic guided workflows, and documentation.

Changes:

  • Added new TUI modal screens for Copilot (F10) and Assistant (F11) and bound them in the main TUI app.
  • Added a new FastAPI /api/chat router for starting sessions, asking questions via SSE, history, export, copy-to-clipboard, and summary stats.
  • Added initial agent/session scaffolding (conversation store, copilot/assistant sessions, prompts, governance/audit/workflow stubs) plus docs and examples.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 31 comments.

Show a summary per file
File Description
gnat/tui/screens/copilot_screen.py New Copilot TUI screen implementation (chat-style UI).
gnat/tui/screens/assistant_screen.py New Assistant TUI screen implementation (commands for enrich/draft/explain/search).
gnat/tui/app.py Adds F10/F11 bindings and actions to open the new screens.
gnat/serve/routers/chat.py New REST API router for chat sessions, SSE streaming, history, export/copy/summary.
gnat/serve/app.py Registers the new chat router in the FastAPI app.
gnat/cli/copilot_cli.py Adds argparse-based CLI subcommands for copilot/assistant actions.
gnat/agents/conversations.py Introduces SQLite-backed conversation/session persistence.
gnat/agents/copilot_investigation.py Adds copilot session logic (question generation, phase advancement, next-step suggestion).
gnat/agents/assistant_analyst.py Adds assistant session logic (enrichment suggestions, drafting, explain/search helpers).
gnat/agents/copilot_prompts.py Adds prompt templates and prompt-building helpers.
gnat/agents/hypothesis_refinement.py Adds hypothesis confidence scoring/refinement helper.
gnat/agents/copilot_governor.py Adds governance/cost-tracking scaffolding for copilot/assistant actions.
gnat/agents/copilot_review.py Adds HITL review scaffolding for high-confidence suggestions.
gnat/agents/copilot_audit.py Adds audit log scaffolding for copilot/assistant operations.
gnat/agents/copilot_workflows.py Adds guided workflow scaffolding and a workflow factory.
gnat/agents/init.py Re-exports new agent/session/governance symbols from gnat.agents.
docs/how-to/copilot-assistant.md New how-to documentation for Copilot/Assistant usage.
POLISH_CHANGELOG.md New changelog-style document describing UX polish and API endpoints.
EXAMPLES_COPILOT.md New programmatic usage examples for copilot/assistant/workflows/governance.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +83 to +88
# Ask copilot (streaming internally)
response = await copilot.ask_clarifying_question(message)

# Yield tokens as SSE
for token in response:
yield f"data: {json.dumps({'token': token})}\n\n"
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copilot.ask_clarifying_question() returns a full string, but this endpoint treats it as an iterable of tokens (for token in response), which will stream one character per SSE event. Either change the agent API to return an async/sync token stream, or send a single SSE event with the full response text.

Suggested change
# Ask copilot (streaming internally)
response = await copilot.ask_clarifying_question(message)
# Yield tokens as SSE
for token in response:
yield f"data: {json.dumps({'token': token})}\n\n"
response = await copilot.ask_clarifying_question(message)
# ask_clarifying_question returns the full response text, so emit it
# as a single SSE event rather than iterating character-by-character.
yield f"data: {json.dumps({'token': response})}\n\n"

Copilot uses AI. Check for mistakes.
Comment on lines +168 to +172
yield Vertical(
CopilotStatus(id="status"),
CopilotConversation(id="history"),
CopilotInput(
conversation=self.query_one("#history", CopilotConversation),
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

compose() calls self.query_one("#history", CopilotConversation) while the widget tree is still being constructed. In Textual, query_one isn’t safe/available during compose() before the DOM is mounted, so this will raise. Instantiate the history widget into a local variable and pass it to CopilotInput directly.

Suggested change
yield Vertical(
CopilotStatus(id="status"),
CopilotConversation(id="history"),
CopilotInput(
conversation=self.query_one("#history", CopilotConversation),
history = CopilotConversation(id="history")
yield Vertical(
CopilotStatus(id="status"),
history,
CopilotInput(
conversation=history,

Copilot uses AI. Check for mistakes.
Comment on lines +170 to +172
class AssistantScreen(Container):
"""Live Analyst Assistant TUI screen."""

Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This screen subclasses Container, but it’s opened via push_screen() and dismissed via pop_screen(), which are Screen APIs. To behave as a modal/screen in Textual, AssistantScreen should subclass textual.screen.Screen (consistent with existing screens).

Copilot uses AI. Check for mistakes.
Comment on lines +133 to +141
parts = text.split(":")
if len(parts) < 2:
self.panel.add_assistant_response("Usage: /explain <stix-type>:<value>")
return

try:
from gnat.orm import Indicator
stix_obj = Indicator(
pattern=f"[{parts[0]}:value = '{parts[1]}']",
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/explain parsing is incorrect: parts = text.split(":") makes parts[0] include the /explain prefix (e.g. "/explain ipv4-addr"), yielding an invalid STIX pattern. Split once on whitespace to remove the command, then split the remainder on the first : (also important for values containing :).

Suggested change
parts = text.split(":")
if len(parts) < 2:
self.panel.add_assistant_response("Usage: /explain <stix-type>:<value>")
return
try:
from gnat.orm import Indicator
stix_obj = Indicator(
pattern=f"[{parts[0]}:value = '{parts[1]}']",
command_parts = text.strip().split(None, 1)
if len(command_parts) < 2:
self.panel.add_assistant_response("Usage: /explain <stix-type>:<value>")
return
argument = command_parts[1].strip()
stix_parts = argument.split(":", 1)
if len(stix_parts) < 2 or not stix_parts[0].strip() or not stix_parts[1].strip():
self.panel.add_assistant_response("Usage: /explain <stix-type>:<value>")
return
stix_type = stix_parts[0].strip()
value = stix_parts[1].strip()
try:
from gnat.orm import Indicator
stix_obj = Indicator(
pattern=f"[{stix_type}:value = '{value}']",

Copilot uses AI. Check for mistakes.
from textual.reactive import reactive
from rich.text import Text
from rich.panel import Panel
import asyncio
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused imports (Horizontal, Tabs, TabPane, Button, reactive, asyncio) will fail ruff (F401). Remove them or use them.

Suggested change
import asyncio

Copilot uses AI. Check for mistakes.
from datetime import datetime
from typing import Optional, Dict, Any

from gnat.agents.governor import ReviewService, ReviewItem, ReviewStatus
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReviewService, ReviewItem, and ReviewStatus are imported from gnat.agents.governor, but that module doesn’t define them. This will raise ImportError on import. These types live in the review package (e.g. gnat.review.service.ReviewService and gnat.review.models.ReviewItem/ReviewStatus).

Suggested change
from gnat.agents.governor import ReviewService, ReviewItem, ReviewStatus
from gnat.review.service import ReviewService
from gnat.review.models import ReviewItem, ReviewStatus

Copilot uses AI. Check for mistakes.

# Determine next phase
new_phase = self._advance_phase(phase, investigation_state)
self.store.update_session_state(self.conversation_id, new_phase.value)
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After calling self.store.update_session_state(...), self.session_context.state is never refreshed/updated. Because subsequent calls read the phase from self.session_context.state, the session will stay stuck in the initial phase even though the DB row changed. Update self.session_context.state locally (or re-fetch the session context from the store) after state transitions.

Suggested change
self.store.update_session_state(self.conversation_id, new_phase.value)
self.store.update_session_state(self.conversation_id, new_phase.value)
self.session_context.state = new_phase.value

Copilot uses AI. Check for mistakes.
Comment on lines +95 to +111
async def process_input(self, text: str) -> None:
"""Process user input and trigger copilot response."""
if not text.strip():
return

if text.startswith("/"):
await self._handle_command(text)
else:
self.streaming = True
self.cancel_stream = False
self.conversation.add_analyst_message(text)
try:
await self.conversation.stream_copilot_response(text)
finally:
self.streaming = False

self.value = ""
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CopilotInput.process_input() is never wired to an actual Textual event (e.g. Input.Submitted). As a result, pressing Enter in the input won’t trigger copilot actions. Follow the pattern used in other screens (on_input_submitted) or override the appropriate handler in the Input subclass to call process_input(...).

Copilot uses AI. Check for mistakes.
Comment on lines +209 to +216
def action_cancel_stream(self) -> None:
"""Cancel ongoing LLM streaming."""
input_widget = self.query_one("#input", CopilotInput)
if input_widget.streaming:
input_widget.cancel_stream = True
history = self.query_one("#history", CopilotConversation)
history.add_system_message("Stream cancelled by analyst")

Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ctrl+C sets cancel_stream = True, but nothing in stream_copilot_response() (or the agent session) checks this flag or cancels the in-flight task. That means “Cancel Stream” will only print a system message and won’t stop the LLM call. Track the running task and cancel it, or check a cancellation event/flag during token streaming.

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +16
from gnat.agents.copilot_investigation import InvestigationCopilotSession, CopilotPhase
from gnat.agents.conversations import ConversationStore
Copy link

Copilot AI Apr 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CopilotPhase and ConversationStore are imported but not used. With ruff (F401) enabled, unused imports will fail CI; remove them or use them (e.g., if the workflow needs to persist state via the store).

Suggested change
from gnat.agents.copilot_investigation import InvestigationCopilotSession, CopilotPhase
from gnat.agents.conversations import ConversationStore
from gnat.agents.copilot_investigation import InvestigationCopilotSession

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants