ToxMCP
diff --git a/README.md b/README.md
Lines changed: 2 additions & 1 deletion
diff --git a/docs/contracts/contract_manifest.json b/docs/contracts/contract_manifest.json
Lines changed: 4 additions & 0 deletions
diff --git a/docs/deployment_hardening.md b/docs/deployment_hardening.md
Lines changed: 14 additions & 0 deletions
diff --git a/docs/http_audit_operations.md b/docs/http_audit_operations.md
Lines changed: 46 additions & 0 deletions
diff --git a/docs/operator_guide.md b/docs/operator_guide.md
Lines changed: 4 additions & 0 deletions
diff --git a/docs/releases/v0.2.0.release_metadata.json b/docs/releases/v0.2.0.release_metadata.json
Lines changed: 1 addition & 0 deletions
diff --git a/scripts/replay_http_audit.py b/scripts/replay_http_audit.py
Lines changed: 46 additions & 0 deletions
diff --git a/src/exposure_scenario_mcp/contracts.py b/src/exposure_scenario_mcp/contracts.py
Lines changed: 8 additions & 0 deletions
diff --git a/src/exposure_scenario_mcp/guidance.py b/src/exposure_scenario_mcp/guidance.py
Lines changed: 68 additions & 0 deletions
@@ -307,6 +307,7 @@ left out of the public overview until the downstream consumer ships.
 - `docs://defaults-curation-report`
 - `docs://operator-guide`
 - `docs://deployment-hardening-guide`
+- `docs://http-audit-operations-guide`
 - `docs://provenance-policy`
 - `docs://result-status-semantics`
 - `docs://uncertainty-framework`
@@ -404,7 +405,7 @@ runtime defaults/registries, and verifies a representative core/integration/work
 Current published surface from `docs/contracts/contract_manifest.json`:
 
 - `39` tools
-- `65` resources
+- `66` resources
 - `7` prompts
 - `153` schemas
 - `98` examples
 
@@ -300,6 +300,10 @@
       "uri": "docs://deployment-hardening-guide",
       "description": "Guide to hardening remote streamable-http deployments with external auth, TLS, origin controls, and logging."
     },
+    {
+      "uri": "docs://http-audit-operations-guide",
+      "description": "Guide to retaining, replaying, and debugging streamable-http audit events."
+    },
     {
       "uri": "docs://provenance-policy",
       "description": "Provenance and assumption-emission policy for auditability."
 
@@ -24,6 +24,10 @@ controls first and then layer gateway controls on top.
   request-level JSONL audit trail without persisting raw bodies.
 - Keep the default request timeout and concurrency ceiling unless benchmark evidence shows a
   reviewed need to widen them.
+- Rotate audit JSONL externally with `logrotate`, container log rotation, or explicit
+  per-day/per-release file paths. The server intentionally appends and never truncates in-process.
+- Retain enough JSONL history to cover incident review, release rollback, and benchmark drift
+  analysis. A reviewed 30- to 90-day retention window is a reasonable default for remote HTTP.
 - Treat `0` as an explicit opt-out of the in-process request-size limit, not the default.
 
 ## Recommended reverse-proxy posture
@@ -35,6 +39,16 @@ controls first and then layer gateway controls on top.
 - Set gateway-level request timeouts (suggest 30–60 s for screening tools, 120 s for envelope or probability-bound builds)
 - Capture structured access logs with timestamps and client identity
 
+## Replay and forensic workflow
+
+- Capture the `X-Exposure-Audit-Request-Id` response header from the calling client or access log.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <id>` to isolate
+  a single exchange without storing the raw request body.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>` to
+  group logically identical JSON-RPC requests that differ only in key order or redacted secrets.
+- Match the event defaults and release fingerprints against `defaults://manifest` and
+  `release://metadata-report` before replaying a scenario downstream.
+
 ## Request and execution guardrails
 
 - Input schemas enforce `max_length` on the highest-volume list fields (e.g. aggregate component scenarios, evidence reconciliation records).
 
@@ -0,0 +1,46 @@
+# HTTP Audit Operations Guide
+
+Use this guide when a remote `streamable-http` deployment needs retention planning, replay,
+or request-level debugging without persisting raw request bodies.
+
+## What every event carries
+
+- `requestId`: echoed back through `X-Exposure-Audit-Request-Id` for incident correlation.
+- `normalizedInputDigestSha256`: stable digest over a redacted, canonical JSON request payload
+  that ignores the top-level JSON-RPC `id`.
+- `outputDigestSha256`: digest over the JSON-RPC response body for change detection.
+- `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired`: the high-signal trust surface
+  needed for screening review without reopening the whole tool payload first.
+- `reproducibility.defaultsVersion` and `reproducibility.defaultsHashSha256`: the exact defaults
+  pack fingerprint needed to confirm replay compatibility.
+- `reproducibility.releaseVersion`, `reproducibility.releaseMetadataPath`, and
+  `reproducibility.releaseMetadataSha256`: the release snapshot that should match
+  `release://metadata-report` or the checked-in release metadata file.
+
+## Retention and rotation
+
+- Treat the JSONL sink as append-only application evidence, not as a transient debug log.
+- Rotate externally with host tooling such as `logrotate`, container log rotation, or explicit
+  per-day/per-release paths.
+- Keep write permissions narrow because the audit file becomes part of the operational evidence
+  trail for HTTP requests.
+- Retain enough history to support incident review, release rollback checks, and benchmark drift
+  investigation. A reviewed 30- to 90-day retention window is a reasonable default.
+
+## Replay workflow
+
+```bash
+python scripts/summarize_http_audit.py /path/to/http-audit.jsonl
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <request-id>
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>
+```
+
+## Reproducibility checklist
+
+1. Match `requestId` to the client-visible response header.
+2. Match `defaultsVersion` and `defaultsHashSha256` to `defaults://manifest`.
+3. Match `releaseVersion` and the release metadata fields to `release://metadata-report`.
+4. Confirm `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired` still align with the
+   downstream interpretation you plan to make.
+5. Treat `normalizedInputDigestSha256` as an equivalence key for redacted replay, not as a
+   substitute for validated scenario inputs.
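
Reviewer sketch (not part of this change): the guide describes `normalizedInputDigestSha256` as a stable digest over a redacted, canonical JSON payload that ignores the top-level JSON-RPC `id`. The snippet below shows one way such a digest could be computed. The `REDACTED_KEYS` set, the `[REDACTED]` marker, and the function names are assumptions made for illustration; the server's actual redaction rules and canonicalization are not shown in this diff.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any

# Hypothetical key names to redact; the real redaction list is not part of this diff.
REDACTED_KEYS = {"authorization", "token", "api_key"}


def _redact(value: Any) -> Any:
    """Recursively replace values stored under sensitive keys with a fixed marker."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if key.lower() in REDACTED_KEYS else _redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [_redact(item) for item in value]
    return value


def normalized_input_digest(request: dict[str, Any]) -> str:
    """Hash a JSON-RPC request after dropping its top-level id and redacting secrets."""
    payload = {key: item for key, item in request.items() if key != "id"}
    # sort_keys plus fixed separators give a canonical byte sequence regardless of key order.
    canonical = json.dumps(_redact(payload), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under these assumptions, two requests that differ only in key order, JSON-RPC `id`, or redacted secret values hash to the same digest, which is what makes the value usable as an equivalence key rather than as proof that the scenario inputs were valid.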
@@ -37,4 +37,8 @@ uv run generate-exposure-contracts
   reviewed workload needs longer execution, and only widen concurrency deliberately.
 - Follow `docs/release_runbook.md` for releases and `docs/maintainer_operating_model.md` for the
   monthly triage and release-buddy cadence.
+- Use `python scripts/summarize_http_audit.py <path>` for fleet-level counts and
+  `python scripts/replay_http_audit.py <path> --request-id <id>` for request-level debugging.
+- Keep `docs://http-audit-operations-guide` available to operators who need to trace a result
+  back to a defaults manifest and release metadata snapshot.
 - Keep downstream orchestration-layer and PBPK handoffs explicit; do not add hidden transformation logic in clients.
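
Reviewer sketch (not part of this change): tracing a result back to a defaults manifest and release metadata snapshot amounts to comparing the event's `reproducibility` fields with the values currently published. The helper below is a minimal illustration of that comparison. The event field names come from the audit guide; the function name and the assumption that `defaults://manifest` and `release://metadata-report` expose keys with the same names are hypothetical.

```python
from __future__ import annotations

from typing import Any


def fingerprint_mismatches(
    event: dict[str, Any],
    defaults_manifest: dict[str, Any],
    release_metadata: dict[str, Any],
) -> list[str]:
    """Return the reproducibility fields whose recorded value differs from the published one."""
    recorded = event.get("reproducibility", {})
    # Assumed layout: the published documents carry keys named like the audit event fields.
    expected = {
        "defaultsVersion": defaults_manifest.get("defaultsVersion"),
        "defaultsHashSha256": defaults_manifest.get("defaultsHashSha256"),
        "releaseVersion": release_metadata.get("releaseVersion"),
    }
    return [name for name, value in expected.items() if recorded.get(name) != value]
```

An empty list suggests the event was produced against the same defaults pack and release; any returned name is a reason to pause before replaying the scenario downstream.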
@@ -80,6 +80,7 @@
     "docs://release-readiness",
     "docs://release-trust-checklist",
     "docs://deployment-hardening-guide",
+    "docs://http-audit-operations-guide",
     "docs://security-provenance-review",
     "docs://test-evidence-summary",
     "docs://verification-summary",
 
@@ -0,0 +1,46 @@
+"""Filter and replay JSONL audit events emitted by streamable-http deployments."""
+
+from __future__ import annotations
+
+import argparse
+import json
+from pathlib import Path
+
+from exposure_scenario_mcp.http_audit import (
+    build_http_audit_replay_report,
+    load_http_audit_events,
+)
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("path", type=Path, help="Path to the JSONL audit log.")
+    parser.add_argument(
+        "--request-id", help="Filter to a specific requestId from the response header."
+    )
+    parser.add_argument(
+        "--input-digest",
+        help="Filter to a normalizedInputDigestSha256 value to group equivalent requests.",
+    )
+    parser.add_argument("--operation", help="Filter to a specific tool, prompt, or resource name.")
+    parser.add_argument(
+        "--latest",
+        type=int,
+        help="Keep only the latest N matching events after other filters are applied.",
+    )
+    args = parser.parse_args(argv)
+
+    events = load_http_audit_events(args.path)
+    report = build_http_audit_replay_report(
+        events,
+        request_id=args.request_id,
+        normalized_input_digest=args.input_digest,
+        operation_name=args.operation,
+        latest=args.latest,
+    )
+    print(json.dumps(report, indent=2, sort_keys=True))
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
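
Reviewer sketch (not part of this change): the script delegates to `load_http_audit_events` and `build_http_audit_replay_report`, whose bodies are not included in this diff. The stand-in below illustrates the filtering semantics the CLI flags imply, under stated assumptions: `load_events` and `filter_events` are hypothetical names, the `operationName` event key is a guess, and "latest" is taken to mean the last N matches in file order, which relies on the sink being append-only.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any


def load_events(path: Path) -> list[dict[str, Any]]:
    """Read one JSON object per non-empty line of a JSONL audit file."""
    events: list[dict[str, Any]] = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events


def filter_events(
    events: list[dict[str, Any]],
    *,
    request_id: str | None = None,
    normalized_input_digest: str | None = None,
    operation_name: str | None = None,
    latest: int | None = None,
) -> list[dict[str, Any]]:
    """Apply every supplied filter, then keep only the last `latest` matches."""
    matched = [
        event
        for event in events
        if (request_id is None or event.get("requestId") == request_id)
        and (
            normalized_input_digest is None
            or event.get("normalizedInputDigestSha256") == normalized_input_digest
        )
        # "operationName" is an assumed key; the real event schema is not shown here.
        and (operation_name is None or event.get("operationName") == operation_name)
    ]
    if latest is not None:
        # A plain matched[-0:] would return everything, so zero is handled explicitly.
        matched = matched[-latest:] if latest > 0 else []
    return matched
```

Applying `latest` after the other filters mirrors the help text of `--latest`, so `--input-digest <sha256> --latest 1` would select the most recent request in an equivalence group.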
@@ -792,6 +792,12 @@ def build_contract_manifest(defaults_registry: DefaultsRegistry) -> ContractMani
                     "auth, TLS, origin controls, and logging."
                 ),
             ),
+            ContractResourceEntry(
+                uri="docs://http-audit-operations-guide",
+                description=(
+                    "Guide to retaining, replaying, and debugging streamable-http audit events."
+                ),
+            ),
             ContractResourceEntry(
                 uri="docs://provenance-policy",
                 description="Provenance and assumption-emission policy for auditability.",
@@ -1453,6 +1459,7 @@ def build_security_provenance_review_report(
             references=[
                 "docs://operator-guide",
                 "docs://deployment-hardening-guide",
+                "docs://http-audit-operations-guide",
                 "docs://troubleshooting",
             ],
         ),
@@ -1545,6 +1552,7 @@ def build_release_metadata_report(defaults_registry: DefaultsRegistry) -> Releas
             "docs://release-readiness",
             "docs://release-trust-checklist",
             "docs://deployment-hardening-guide",
+            "docs://http-audit-operations-guide",
             "docs://security-provenance-review",
             "docs://test-evidence-summary",
             "docs://verification-summary",
 
@@ -272,6 +272,10 @@ def operator_guide() -> str:
   monthly triage and release-buddy cadence.
 - Keep downstream orchestration-layer and PBPK handoffs explicit;
   do not add hidden transformation logic in clients.
+- Use `python scripts/summarize_http_audit.py <path>` for fleet-level counts and
+  `python scripts/replay_http_audit.py <path> --request-id <id>` for request-level debugging.
+- Keep `docs://http-audit-operations-guide` available to operators who need to trace a result
+  back to a defaults manifest and release metadata snapshot.
 """
 
 
@@ -346,9 +350,23 @@ def deployment_hardening_guide() -> str:
   request-level JSONL audit trail without persisting raw bodies.
 - Keep the default request timeout and concurrency ceiling unless you have benchmark evidence that
   a wider setting is necessary for your deployment.
+- Rotate audit JSONL externally with `logrotate`, container log rotation, or explicit
+  per-day/per-release file paths. The server intentionally appends and never truncates in-process.
+- Retain enough JSONL history to cover incident review, release rollback, and benchmark drift
+  analysis. A reviewed 30- to 90-day retention window is a reasonable default for remote HTTP.
 - Re-run release verification after transport or deployment changes.
 - Keep warning-level security and provenance findings visible to downstream users.
 
+## Replay and forensic workflow
+
+- Capture the `X-Exposure-Audit-Request-Id` response header from the calling client or access log.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <id>` to isolate
+  a single exchange without storing the raw request body.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>` to
+  group logically identical JSON-RPC requests that differ only in key order or redacted secrets.
+- Match the event defaults and release fingerprints against `defaults://manifest` and
+  `release://metadata-report` before replaying a scenario downstream.
+
 ## Boundary note
 
 This MCP now provides first-party bearer-token auth, explicit origin allow-list enforcement, and
@@ -360,6 +378,56 @@ def deployment_hardening_guide() -> str:
 """
 
 
+def http_audit_operations_guide() -> str:
+    return """# HTTP Audit Operations Guide
+
+Use this guide when a remote `streamable-http` deployment needs retention planning, replay,
+or request-level debugging without persisting raw request bodies.
+
+## What every event carries
+
+- `requestId`: echoed back through `X-Exposure-Audit-Request-Id` for incident correlation.
+- `normalizedInputDigestSha256`: stable digest over a redacted, canonical JSON request payload
+  that ignores the top-level JSON-RPC `id`.
+- `outputDigestSha256`: digest over the JSON-RPC response body for change detection.
+- `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired`: the high-signal trust surface
+  needed for screening review without reopening the whole tool payload first.
+- `reproducibility.defaultsVersion` and `reproducibility.defaultsHashSha256`: the exact defaults
+  pack fingerprint needed to confirm replay compatibility.
+- `reproducibility.releaseVersion`, `reproducibility.releaseMetadataPath`, and
+  `reproducibility.releaseMetadataSha256`: the release snapshot that should match
+  `release://metadata-report` or the checked-in release metadata file.
+
+## Retention and rotation
+
+- Treat the JSONL sink as append-only application evidence, not as a transient debug log.
+- Rotate externally with host tooling such as `logrotate`, container log rotation, or explicit
+  per-day/per-release paths.
+- Keep write permissions narrow because the audit file becomes part of the operational evidence
+  trail for HTTP requests.
+- Retain enough history to support incident review, release rollback checks, and benchmark drift
+  investigation. A reviewed 30- to 90-day retention window is a reasonable default.
+
+## Replay workflow
+
+```bash
+python scripts/summarize_http_audit.py /path/to/http-audit.jsonl
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <request-id>
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>
+```
+
+## Reproducibility checklist
+
+1. Match `requestId` to the client-visible response header.
+2. Match `defaultsVersion` and `defaultsHashSha256` to `defaults://manifest`.
+3. Match `releaseVersion` and the release metadata fields to `release://metadata-report`.
+4. Confirm `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired` still align with the
+   downstream interpretation you plan to make.
+5. Treat `normalizedInputDigestSha256` as an equivalence key for redacted replay, not as a
+   substitute for validated scenario inputs.
+"""
+
+
 def test_evidence_summary_guide() -> str:
     return """# Test Evidence Summary