Commit 2fe4915

Add HTTP audit replay and retention guidance
1 parent e73ae3c commit 2fe4915

15 files changed

Lines changed: 474 additions & 4 deletions

README.md

Lines changed: 2 additions & 1 deletion
@@ -307,6 +307,7 @@ left out of the public overview until the downstream consumer ships.
 - `docs://defaults-curation-report`
 - `docs://operator-guide`
 - `docs://deployment-hardening-guide`
+- `docs://http-audit-operations-guide`
 - `docs://provenance-policy`
 - `docs://result-status-semantics`
 - `docs://uncertainty-framework`
@@ -404,7 +405,7 @@ runtime defaults/registries, and verifies a representative core/integration/work
 Current published surface from `docs/contracts/contract_manifest.json`:
 
 - `39` tools
-- `65` resources
+- `66` resources
 - `7` prompts
 - `153` schemas
 - `98` examples

docs/contracts/contract_manifest.json

Lines changed: 4 additions & 0 deletions
@@ -300,6 +300,10 @@
       "uri": "docs://deployment-hardening-guide",
       "description": "Guide to hardening remote streamable-http deployments with external auth, TLS, origin controls, and logging."
     },
+    {
+      "uri": "docs://http-audit-operations-guide",
+      "description": "Guide to retaining, replaying, and debugging streamable-http audit events."
+    },
     {
       "uri": "docs://provenance-policy",
       "description": "Provenance and assumption-emission policy for auditability."

docs/deployment_hardening.md

Lines changed: 14 additions & 0 deletions
@@ -24,6 +24,10 @@ controls first and then layer gateway controls on top.
   request-level JSONL audit trail without persisting raw bodies.
 - Keep the default request timeout and concurrency ceiling unless benchmark evidence shows a
   reviewed need to widen them.
+- Rotate audit JSONL externally with `logrotate`, container log rotation, or explicit
+  per-day/per-release file paths. The server intentionally appends and never truncates in-process.
+- Retain enough JSONL history to cover incident review, release rollback, and benchmark drift
+  analysis. A reviewed 30- to 90-day retention window is a reasonable default for remote HTTP.
 - Treat `0` as an explicit opt-out of the in-process request-size limit, not the default.
 
 ## Recommended reverse-proxy posture
@@ -35,6 +39,16 @@ controls first and then layer gateway controls on top.
 - Set gateway-level request timeouts (suggest 30–60 s for screening tools, 120 s for envelope or probability-bound builds)
 - Capture structured access logs with timestamps and client identity
 
+## Replay and forensic workflow
+
+- Capture the `X-Exposure-Audit-Request-Id` response header from the calling client or access log.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <id>` to isolate
+  a single exchange without storing the raw request body.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>` to
+  group logically identical JSON-RPC requests that differ only in key order or redacted secrets.
+- Match the event defaults and release fingerprints against `defaults://manifest` and
+  `release://metadata-report` before replaying a scenario downstream.
+
 ## Request and execution guardrails
 
 - Input schemas enforce `max_length` on the highest-volume list fields (e.g. aggregate component scenarios, evidence reconciliation records).
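
The rotation and retention bullets added to `docs/deployment_hardening.md` mention explicit per-day/per-release file paths and a reviewed 30- to 90-day window. The sketch below illustrates one way an operator might implement that outside the server. It rests on assumptions: the file-name pattern `http-audit.<release>.<YYYY-MM-DD>.jsonl` and the helpers `audit_path_for` and `expired_paths` are hypothetical and are not part of this commit.

```python
from datetime import date, timedelta
from pathlib import Path


def audit_path_for(base_dir: Path, release: str, day: date) -> Path:
    """Build a per-day, per-release audit file path (hypothetical naming scheme)."""
    return base_dir / f"http-audit.{release}.{day.isoformat()}.jsonl"


def expired_paths(paths, today: date, retention_days: int = 90):
    """Return the paths whose embedded date is older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for path in paths:
        # Assumed layout: http-audit.<release>.<YYYY-MM-DD>.jsonl
        parts = path.name.rsplit(".", 2)
        if len(parts) != 3:
            continue  # not an audit file under the assumed scheme
        try:
            stamp = date.fromisoformat(parts[1])
        except ValueError:
            continue  # no parseable date, leave the file alone
        if stamp < cutoff:
            expired.append(path)
    return expired
```

The function only reports candidates; archiving or deleting them is left to host tooling, which matches the guidance that the server itself never truncates the log.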

docs/http_audit_operations.md

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+# HTTP Audit Operations Guide
+
+Use this guide when a remote `streamable-http` deployment needs retention planning, replay,
+or request-level debugging without persisting raw request bodies.
+
+## What every event carries
+
+- `requestId`: echoed back through `X-Exposure-Audit-Request-Id` for incident correlation.
+- `normalizedInputDigestSha256`: stable digest over a redacted, canonical JSON request payload
+  that ignores the top-level JSON-RPC `id`.
+- `outputDigestSha256`: digest over the JSON-RPC response body for change detection.
+- `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired`: the high-signal trust surface
+  needed for screening review without reopening the whole tool payload first.
+- `reproducibility.defaultsVersion` and `reproducibility.defaultsHashSha256`: the exact defaults
+  pack fingerprint needed to confirm replay compatibility.
+- `reproducibility.releaseVersion`, `reproducibility.releaseMetadataPath`, and
+  `reproducibility.releaseMetadataSha256`: the release snapshot that should match
+  `release://metadata-report` or the checked-in release metadata file.
+
+## Retention and rotation
+
+- Treat the JSONL sink as append-only application evidence, not as a transient debug log.
+- Rotate externally with host tooling such as `logrotate`, container log rotation, or explicit
+  per-day/per-release paths.
+- Keep write permissions narrow because the audit file becomes part of the operational evidence
+  trail for HTTP requests.
+- Retain enough history to support incident review, release rollback checks, and benchmark drift
+  investigation. A reviewed 30- to 90-day retention window is a reasonable default.
+
+## Replay workflow
+
+```bash
+python scripts/summarize_http_audit.py /path/to/http-audit.jsonl
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <request-id>
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>
+```
+
+## Reproducibility checklist
+
+1. Match `requestId` to the client-visible response header.
+2. Match `defaultsVersion` and `defaultsHashSha256` to `defaults://manifest`.
+3. Match `releaseVersion` and the release metadata fields to `release://metadata-report`.
+4. Confirm `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired` still align with the
+   downstream interpretation you plan to make.
+5. Treat `normalizedInputDigestSha256` as an equivalence key for redacted replay, not as a
+   substitute for validated scenario inputs.
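
The guide describes `normalizedInputDigestSha256` as a stable digest over a redacted, canonical JSON payload that ignores the top-level JSON-RPC `id`. The server code that computes it is not part of this diff, so the following is only a sketch of how such a digest could work. The `REDACTED_KEYS` set, the `<redacted>` marker, and the function names are assumptions made for illustration.

```python
import hashlib
import json

# Hypothetical list of keys whose values are masked before hashing.
REDACTED_KEYS = {"authorization", "token", "api_key"}


def redact(value):
    """Recursively replace values stored under sensitive keys with a fixed marker."""
    if isinstance(value, dict):
        return {
            key: "<redacted>" if key.lower() in REDACTED_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def normalized_input_digest(payload: dict) -> str:
    """Hash a redacted, key-sorted form of a JSON-RPC request, skipping the top-level id."""
    body = {key: item for key, item in payload.items() if key != "id"}
    canonical = json.dumps(redact(body), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under these assumptions, two requests that differ only in key order, in the JSON-RPC `id`, or in a redacted secret produce the same digest, which is why the checklist treats the value as an equivalence key and not as proof that the scenario inputs were valid.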

docs/operator_guide.md

Lines changed: 4 additions & 0 deletions
@@ -37,4 +37,8 @@ uv run generate-exposure-contracts
   reviewed workload needs longer execution, and only widen concurrency deliberately.
 - Follow `docs/release_runbook.md` for releases and `docs/maintainer_operating_model.md` for the
   monthly triage and release-buddy cadence.
+- Use `python scripts/summarize_http_audit.py <path>` for fleet-level counts and
+  `python scripts/replay_http_audit.py <path> --request-id <id>` for request-level debugging.
+- Keep `docs://http-audit-operations-guide` available to operators who need to trace a result
+  back to a defaults manifest and release metadata snapshot.
 - Keep downstream orchestration-layer and PBPK handoffs explicit; do not add hidden transformation logic in clients.
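
Tracing a result back to a defaults manifest and release metadata snapshot comes down to comparing the `reproducibility` fields of an audit event with the values an operator expects. A minimal sketch follows. The field names are taken from the audit guide, while the `expected` dictionary and the helper `reproducibility_mismatches` are hypothetical; how the expected values are read from `defaults://manifest` or `release://metadata-report` is not shown here.

```python
REPRO_FIELDS = (
    "defaultsVersion",
    "defaultsHashSha256",
    "releaseVersion",
    "releaseMetadataSha256",
)


def reproducibility_mismatches(event: dict, expected: dict) -> dict:
    """Return {field: (logged, expected)} for each fingerprint field that differs."""
    logged = event.get("reproducibility", {})
    return {
        field: (logged.get(field), expected.get(field))
        for field in REPRO_FIELDS
        if logged.get(field) != expected.get(field)
    }
```

An empty result means the logged event and the expected snapshot agree on every compared field; any entry is a reason to stop before replaying the scenario downstream.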

docs/releases/v0.2.0.release_metadata.json

Lines changed: 1 addition & 0 deletions
@@ -80,6 +80,7 @@
     "docs://release-readiness",
     "docs://release-trust-checklist",
     "docs://deployment-hardening-guide",
+    "docs://http-audit-operations-guide",
     "docs://security-provenance-review",
     "docs://test-evidence-summary",
     "docs://verification-summary",

scripts/replay_http_audit.py

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
"""Filter and replay JSONL audit events emitted by streamable-http deployments."""
2+
3+
from __future__ import annotations
4+
5+
import argparse
6+
import json
7+
from pathlib import Path
8+
9+
from exposure_scenario_mcp.http_audit import (
10+
build_http_audit_replay_report,
11+
load_http_audit_events,
12+
)
13+
14+
15+
def main(argv: list[str] | None = None) -> int:
16+
parser = argparse.ArgumentParser(description=__doc__)
17+
parser.add_argument("path", type=Path, help="Path to the JSONL audit log.")
18+
parser.add_argument(
19+
"--request-id", help="Filter to a specific requestId from the response header."
20+
)
21+
parser.add_argument(
22+
"--input-digest",
23+
help="Filter to a normalizedInputDigestSha256 value to group equivalent requests.",
24+
)
25+
parser.add_argument("--operation", help="Filter to a specific tool, prompt, or resource name.")
26+
parser.add_argument(
27+
"--latest",
28+
type=int,
29+
help="Keep only the latest N matching events after other filters are applied.",
30+
)
31+
args = parser.parse_args(argv)
32+
33+
events = load_http_audit_events(args.path)
34+
report = build_http_audit_replay_report(
35+
events,
36+
request_id=args.request_id,
37+
normalized_input_digest=args.input_digest,
38+
operation_name=args.operation,
39+
latest=args.latest,
40+
)
41+
print(json.dumps(report, indent=2, sort_keys=True))
42+
return 0
43+
44+
45+
if __name__ == "__main__":
46+
raise SystemExit(main())
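
The script delegates to `load_http_audit_events` and `build_http_audit_replay_report`, whose implementations are not included in this diff. To make the filtering semantics described by the CLI help text concrete, here is a rough stand-in. `load_events` and `filter_events` are hypothetical names, the event keys `requestId` and `normalizedInputDigestSha256` come from the audit guide, and the real report structure may differ.

```python
from __future__ import annotations

import json
from pathlib import Path


def load_events(path: Path) -> list[dict]:
    """Read one JSON object per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def filter_events(
    events: list[dict],
    request_id: str | None = None,
    input_digest: str | None = None,
    latest: int | None = None,
) -> list[dict]:
    """Apply the optional equality filters, then keep only the last N matches."""
    matches = [
        event
        for event in events
        if (request_id is None or event.get("requestId") == request_id)
        and (
            input_digest is None
            or event.get("normalizedInputDigestSha256") == input_digest
        )
    ]
    if latest is not None:
        # Assumes events are stored in append order, so the tail is the newest.
        matches = matches[-latest:] if latest > 0 else []
    return matches
```

Applying `latest` last mirrors the help text for `--latest`, which keeps the latest N matching events after the other filters have run.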

src/exposure_scenario_mcp/contracts.py

Lines changed: 8 additions & 0 deletions
@@ -792,6 +792,12 @@ def build_contract_manifest(defaults_registry: DefaultsRegistry) -> ContractMani
                     "auth, TLS, origin controls, and logging."
                 ),
             ),
+            ContractResourceEntry(
+                uri="docs://http-audit-operations-guide",
+                description=(
+                    "Guide to retaining, replaying, and debugging streamable-http audit events."
+                ),
+            ),
             ContractResourceEntry(
                 uri="docs://provenance-policy",
                 description="Provenance and assumption-emission policy for auditability.",
@@ -1453,6 +1459,7 @@ def build_security_provenance_review_report(
                 references=[
                     "docs://operator-guide",
                     "docs://deployment-hardening-guide",
+                    "docs://http-audit-operations-guide",
                     "docs://troubleshooting",
                 ],
             ),
@@ -1545,6 +1552,7 @@ def build_release_metadata_report(defaults_registry: DefaultsRegistry) -> Releas
             "docs://release-readiness",
             "docs://release-trust-checklist",
             "docs://deployment-hardening-guide",
+            "docs://http-audit-operations-guide",
             "docs://security-provenance-review",
             "docs://test-evidence-summary",
             "docs://verification-summary",

src/exposure_scenario_mcp/guidance.py

Lines changed: 68 additions & 0 deletions
@@ -272,6 +272,10 @@ def operator_guide() -> str:
   monthly triage and release-buddy cadence.
 - Keep downstream orchestration-layer and PBPK handoffs explicit;
   do not add hidden transformation logic in clients.
+- Use `python scripts/summarize_http_audit.py <path>` for fleet-level counts and
+  `python scripts/replay_http_audit.py <path> --request-id <id>` for request-level debugging.
+- Keep `docs://http-audit-operations-guide` available to operators who need to trace a result
+  back to a defaults manifest and release metadata snapshot.
 """
 
 
@@ -346,9 +350,23 @@ def deployment_hardening_guide() -> str:
   request-level JSONL audit trail without persisting raw bodies.
 - Keep the default request timeout and concurrency ceiling unless you have benchmark evidence that
   a wider setting is necessary for your deployment.
+- Rotate audit JSONL externally with `logrotate`, container log rotation, or explicit
+  per-day/per-release file paths. The server intentionally appends and never truncates in-process.
+- Retain enough JSONL history to cover incident review, release rollback, and benchmark drift
+  analysis. A reviewed 30- to 90-day retention window is a reasonable default for remote HTTP.
 - Re-run release verification after transport or deployment changes.
 - Keep warning-level security and provenance findings visible to downstream users.
 
+## Replay and forensic workflow
+
+- Capture the `X-Exposure-Audit-Request-Id` response header from the calling client or access log.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <id>` to isolate
+  a single exchange without storing the raw request body.
+- Use `python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>` to
+  group logically identical JSON-RPC requests that differ only in key order or redacted secrets.
+- Match the event defaults and release fingerprints against `defaults://manifest` and
+  `release://metadata-report` before replaying a scenario downstream.
+
 ## Boundary note
 
 This MCP now provides first-party bearer-token auth, explicit origin allow-list enforcement, and
@@ -360,6 +378,56 @@ def deployment_hardening_guide() -> str:
 """
 
 
+def http_audit_operations_guide() -> str:
+    return """# HTTP Audit Operations Guide
+
+Use this guide when a remote `streamable-http` deployment needs retention planning, replay,
+or request-level debugging without persisting raw request bodies.
+
+## What every event carries
+
+- `requestId`: echoed back through `X-Exposure-Audit-Request-Id` for incident correlation.
+- `normalizedInputDigestSha256`: stable digest over a redacted, canonical JSON request payload
+  that ignores the top-level JSON-RPC `id`.
+- `outputDigestSha256`: digest over the JSON-RPC response body for change detection.
+- `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired`: the high-signal trust surface
+  needed for screening review without reopening the whole tool payload first.
+- `reproducibility.defaultsVersion` and `reproducibility.defaultsHashSha256`: the exact defaults
+  pack fingerprint needed to confirm replay compatibility.
+- `reproducibility.releaseVersion`, `reproducibility.releaseMetadataPath`, and
+  `reproducibility.releaseMetadataSha256`: the release snapshot that should match
+  `release://metadata-report` or the checked-in release metadata file.
+
+## Retention and rotation
+
+- Treat the JSONL sink as append-only application evidence, not as a transient debug log.
+- Rotate externally with host tooling such as `logrotate`, container log rotation, or explicit
+  per-day/per-release paths.
+- Keep write permissions narrow because the audit file becomes part of the operational evidence
+  trail for HTTP requests.
+- Retain enough history to support incident review, release rollback checks, and benchmark drift
+  investigation. A reviewed 30- to 90-day retention window is a reasonable default.
+
+## Replay workflow
+
+```bash
+python scripts/summarize_http_audit.py /path/to/http-audit.jsonl
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --request-id <request-id>
+python scripts/replay_http_audit.py /path/to/http-audit.jsonl --input-digest <sha256>
+```
+
+## Reproducibility checklist
+
+1. Match `requestId` to the client-visible response header.
+2. Match `defaultsVersion` and `defaultsHashSha256` to `defaults://manifest`.
+3. Match `releaseVersion` and the release metadata fields to `release://metadata-report`.
+4. Confirm `qualityFlagCodes`, `limitationCodes`, and `manualReviewRequired` still align with the
+   downstream interpretation you plan to make.
+5. Treat `normalizedInputDigestSha256` as an equivalence key for redacted replay, not as a
+   substitute for validated scenario inputs.
+"""
+
+
 def test_evidence_summary_guide() -> str:
     return """# Test Evidence Summary
 
