Commit b248677

Codex/live concordance scientific validation (#28)
* Add scientific validation and live concordance panel
* Automate scientific validation workflows
* Release v0.2.3: audit hardening, privacy controls, provenance, and workflow governance

Audit & Privacy:
- Audit events now carry tamper-evident metadata (contentHash, previousHash, sequence, timestamp) with verify_event_hash() support.
- Sensitive identifiers (DTXSID, CASRN, SMILES, InChI, InChIKey) are hashed before audit logging via _scrub_params_for_audit().

Provenance & Traceability:
- BaseResource captures response_hash, retrieved_at, and retry_count in get_last_provenance().
- AuditBundleStore links bundles into a chain and supports verify_chain().
- HTTP transport extracts/generates W3C traceId and propagates it through audit events.
- Orchestrator bundles include a provenance envelope with serverVersion, runtimeEnvironment, traceId, createdAt, and upstreamProvenance.

Workflow Governance:
- GenRAOrchestrator defaults require_ad_clearance=True when predictive tasks exist; explicit False is still respected.
- Hard AD failures map bundle status to 'denied' instead of 'error'.
- Advisory reviewCheckpoints metadata added to every bundle.

Tests:
- test_audit_hardening.py, test_audit_privacy.py, test_provenance_capture.py, test_trace_propagation.py, test_bundle_provenance.py, test_orchestrator_ad_gating.py

Also includes pre-existing live-concordance reference-value drift checks.
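The workflow-governance rule in the commit message can be sketched as two small decision functions. This is an illustration under stated assumptions, not the `GenRAOrchestrator` code: the function names are hypothetical, the behaviour when no predictive tasks exist (clearance not required) is assumed, and the `"ok"` status value is a placeholder.

```python
from typing import Optional


def resolve_ad_clearance(requested: Optional[bool], has_predictive_tasks: bool) -> bool:
    """Decide whether applicability-domain (AD) clearance is required."""
    if requested is not None:
        # An explicit caller choice, including False, is respected.
        return requested
    # No explicit choice: require clearance only when predictive tasks exist
    # (the no-predictive-task default is an assumption of this sketch).
    return has_predictive_tasks


def bundle_status(task_failed: bool, hard_ad_failure: bool) -> str:
    """Map outcomes to a bundle status; hard AD failures are 'denied'."""
    if hard_ad_failure:
        return "denied"
    # "ok" is a placeholder for the success status, not taken from the source.
    return "error" if task_failed else "ok"
```

Keeping the default resolution separate from the status mapping lets each rule be tested on its own.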
1 parent ea95178 commit b248677

62 files changed

Lines changed: 16818 additions & 33 deletions

AUDIT_MCP_COVERAGE_2026-03-18.md

Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
1+
# EPA CompTox MCP coverage audit
2+
3+
Date: `2026-03-18`
4+
5+
Target:
6+
- `http://127.0.0.1:8002/mcp`
7+
8+
## Executive summary
9+
10+
After the HTTP catalog patch, the live MCP now advertises the full catalog over `tools/list`:
11+
12+
- total tools: `79`
13+
- `nextCursor`: `null`
14+
15+
Current resource coverage by family:
16+
17+
| Resource family | Tool count |
| --- | ---: |
| `chemical` | 10 |
| `bioactivity` | 14 |
| `exposure` | 32 |
| `hazard` | 18 |
| `chemical_list` | 2 |
| `metadata` | 3 |
| `cheminformatics` | 0 |
26+
27+
## What is covered
28+
29+
The current MCP catalog covers the major CTX dashboard data families represented in this repository:
30+
31+
- chemical discovery and detail lookup
32+
- bioactivity assays, assay chemicals, AED, and AOP lookups
33+
- exposure datasets including `HTTK`, `CPDat`, `SEEM`, `MMDB`, functional use, and CCD
34+
- hazard datasets including `ToxValDB`, `ToxRefDB`, cancer, genetox, `ADME/IVIVE`, `IRIS`, `PPRTV`, and `HAWC`
35+
- public chemical lists
36+
- metadata and applicability-domain assets
37+
38+
Representative live-discovery checks after the patch:
39+
40+
- `search_hazard`: present
41+
- `get_hazard_adme_ivive`: present
42+
- `get_hazard_toxref`: present
43+
- `get_bioactivity_aed`: present
44+
- `search_httk`: present
45+
46+
## What is not covered or not yet surfaced
47+
48+
### 1. Predictive services are not part of the live MCP catalog
49+
50+
The repository contains predictive service code (`GenRA`, `OPERA`, `TEST` wrappers), but these are not currently advertised as MCP tools in the live `79`-tool catalog.
51+
52+
Interpretation:
53+
- CTX dashboard-style data access is broadly covered.
54+
- Predictive micro-services exist in the codebase, but they are not yet exposed through the same MCP discovery surface.
55+
56+
### 2. `cheminformatics` currently contributes zero tools
57+
58+
The `cheminformatics` resource is initialized, but its current tool count is `0`.
59+
60+
Interpretation:
61+
- This does not block dashboard data access.
62+
- It is an obvious expansion point if cheminformatics operations are expected to be part of the MCP surface.
63+
64+
## Answer to “do we cover the entire dashboard?”
65+
66+
For the core CTX data tiers used by this server, coverage is strong:
67+
68+
- chemical: yes
69+
- bioactivity: yes
70+
- exposure: yes
71+
- hazard: yes
72+
- metadata/list assets: yes
73+
74+
Two qualifiers remain:
75+
76+
1. “Entire dashboard” is broader than the audited priority families and broader than the CTX API surface used in this repo.
77+
2. Predictive services and cheminformatics are not fully surfaced as MCP tools in the same way as the core CTX data families.
78+
79+
## Bottom line
80+
81+
If the goal is comprehensive MCP coverage of the main CTX dashboard data families, the server now meets it: all six core families expose tools, and the full `79`-tool catalog is discoverable over HTTP.
82+
83+
If the goal is literal “everything in the repo” or “everything a user may associate with the dashboard,” the remaining visible gaps are:
84+
85+
1. predictive services are not exposed as MCP tools
86+
2. cheminformatics contributes no live tools

AUDIT_MCP_ENDPOINTS_2026-03-18.md

Lines changed: 171 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,171 @@
1+
# EPA CompTox MCP audit
2+
3+
Date: `2026-03-18`
4+
5+
Target server:
6+
- `http://127.0.0.1:8002/mcp`
7+
8+
Audit scope:
9+
- MCP discovery via `tools/list`
10+
- Live tool execution for the priority data families:
11+
- `AED`
12+
- `HTTK`
13+
- `ADME/IVIVE`
14+
- Upstream API reachability using `scripts/check_endpoints.py`
15+
16+
## Executive summary
17+
18+
The live server on port `8002` is functional for the priority data families. `AED`, `HTTK`, and `ADME/IVIVE` all returned real data for `DTXSID7020182` (Bisphenol A).
19+
20+
This audit initially surfaced two issues:
21+
22+
1. HTTP `tools/list` returned only the first `50` tools, which hid the remaining `29` of the `79`-tool catalog.
23+
2. The chemical smoke checker used a stale probe URL and produced a false negative.
24+
25+
Both issues are now patched.
26+
27+
Post-fix state:
28+
29+
- HTTP `tools/list` returns the full `79`-tool catalog
30+
- `get_hazard_adme_ivive` is discoverable via `tools/list`
31+
- `scripts/check_endpoints.py --json` passes for chemical, hazard, exposure, and bioactivity when project env is loaded
32+
33+
## Discovery audit
34+
35+
Live `tools/list` now returns `79` tools with `nextCursor: null`.
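A client that follows cursors is protected against the kind of truncation this audit found. The sketch below assumes the server follows the MCP `tools/list` pagination contract (a `tools` array plus an optional `nextCursor`). `fetch_page` is a hypothetical callable standing in for whatever JSON-RPC client is in use, and `fake_fetch_page` is a stand-in server for demonstration only.

```python
from typing import Callable, Optional


def collect_all_tools(fetch_page: Callable[[Optional[str]], dict]) -> list:
    """Follow nextCursor until the server stops returning one."""
    tools = []
    cursor = None
    seen_cursors = set()
    while True:
        page = fetch_page(cursor)
        tools.extend(page.get("tools", []))
        cursor = page.get("nextCursor")
        if cursor is None:
            return tools
        if cursor in seen_cursors:
            # Guard against a server that keeps returning the same cursor.
            raise RuntimeError(f"cursor {cursor!r} repeated; aborting")
        seen_cursors.add(cursor)


def fake_fetch_page(cursor: Optional[str]) -> dict:
    # Hypothetical paginating server: 50 tools on the first page, 29 on the second.
    names = [f"tool_{i}" for i in range(79)]
    if cursor is None:
        return {"tools": [{"name": n} for n in names[:50]], "nextCursor": "50"}
    start = int(cursor)
    return {"tools": [{"name": n} for n in names[start:]], "nextCursor": None}


all_tools = collect_all_tools(fake_fetch_page)
```

A client that stops after the first page would see only 50 entries here, which mirrors the pre-patch symptom.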
36+
37+
Priority tool discovery status:
38+
39+
| Tool | In `tools/list` | Callable | Returns data |
| --- | --- | --- | --- |
| `get_bioactivity_aed` | Yes | Yes | Yes |
| `search_httk` | Yes | Yes | Yes |
| `get_exposure_httk` | Yes | Yes | Yes |
| `get_hazard_adme_ivive` | Yes | Yes | Yes |
45+
46+
## Live MCP execution audit
47+
48+
Test substance:
49+
- `DTXSID7020182` (`Bisphenol A`)
50+
51+
### 1. AED
52+
53+
Tool:
54+
- `get_bioactivity_aed`
55+
56+
Observed result:
57+
- HTTP metadata status: `200`
58+
- Data type: `list`
59+
- Record count: `662`
60+
- Sample fields include:
61+
- `dtxsid`
62+
- `aeid`
63+
- `aedVal`
64+
- `aedType`
65+
- `httkModel`
66+
- `httkVersion`
67+
- `aedValUnit`
68+
69+
Conclusion:
70+
- Functional
71+
- Data-bearing
72+
- Suitable for real audit and downstream analysis
73+
74+
### 2. HTTK
75+
76+
Tools:
77+
- `search_httk`
78+
- `get_exposure_httk`
79+
80+
Observed result for both:
81+
- HTTP metadata status: `200`
82+
- Data type: `list`
83+
- Record count: `18`
84+
- Sample fields include:
85+
- `dtxsid`
86+
- `parameter`
87+
- `measured`
88+
- `predicted`
89+
- `model`
90+
- `species`
91+
- `percentile`
92+
93+
Sample parameter/model:
94+
- `Css`
95+
- `PBTK`
96+
97+
Conclusion:
98+
- Both HTTK tools are functional
99+
- Both return real HTTK rows
100+
- The two outputs are materially equivalent for this test substance
101+
102+
### 3. ADME/IVIVE
103+
104+
Tool:
105+
- `get_hazard_adme_ivive`
106+
107+
Observed result:
108+
- HTTP metadata status: `200`
109+
- Data type: `list`
110+
- Record count: `18`
111+
- Sample fields include:
112+
- `dtxsid`
113+
- `description`
114+
- `measured`
115+
- `predicted`
116+
- `unit`
117+
- `model`
118+
- `species`
119+
- `percentile`
120+
121+
Sample parameter:
122+
- `Clint`
123+
124+
Conclusion:
125+
- Functional
126+
- Data-bearing
127+
- Discoverable through the MCP catalog after the transport patch
128+
129+
## Upstream dependency audit
130+
131+
Command path:
132+
- `scripts/check_endpoints.py --json`
133+
134+
When run with project env loaded, the checker returns:
135+
136+
| Upstream endpoint | Status | Result |
| --- | --- | --- |
| `CTX Chemical API` | `200` | OK |
| `CTX Hazard API` | `200` | OK |
| `CTX Exposure API` | `200` | OK |
| `CTX Bioactivity API` | `200` | OK |
142+
143+
Interpretation:
144+
- Chemical, hazard, exposure, and bioactivity upstreams are reachable and healthy enough for the tested MCP calls.
145+
- The checker now probes the chemical tier with `chemical/detail/search/by-dtxsid/DTXSID7020182`, which matches the live CTX path family used by the server.
146+
147+
## Remaining follow-up
148+
149+
### Finding 1: endpoint matrix documentation still points to `v1` roots
150+
151+
Severity:
152+
- Medium
153+
154+
Why it matters:
155+
- `docs/contracts/endpoint-matrix.md` documents `ctx-api/v1` base roots.
156+
- Direct probe tests against those base roots returned `404`, while the currently functioning CTX probe paths use the non-`v1` endpoint family.
157+
158+
Evidence:
159+
- `docs/contracts/endpoint-matrix.md` lists `https://comptox.epa.gov/ctx-api/v1/chemical` and analogous `v1` roots.
160+
- Direct probes against those base roots returned `404`.
161+
- The patched smoke checker and the live MCP succeed against non-`v1` CTX endpoint paths.
162+
163+
## Bottom line
164+
165+
For the priority areas requested in this audit:
166+
167+
- `AED`: pass
168+
- `HTTK`: pass
169+
- `ADME/IVIVE`: pass
170+
171+
The server retrieves real data for all three target families and now advertises the full catalog correctly over HTTP. The one remaining issue is documentation drift in `docs/contracts/endpoint-matrix.md`.
Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
# MCP Family Live Coverage Audit (2026-03-18)
2+
3+
## Scope
4+
5+
- Server audited: `http://127.0.0.1:8002/mcp`
6+
- Discovery source: live MCP HTTP `tools/list` response
7+
- Goal: verify family-level runtime coverage for the exposed CompTox dashboard domains, with explicit proof for `AED`, `HTTK`, and `ADME/IVIVE`.
8+
9+
## Discovery summary
10+
11+
- Total advertised tools: `79`
12+
- `bioactivity`: `14` tools
13+
- `chemical`: `10` tools
14+
- `chemical_list`: `2` tools
15+
- `exposure`: `32` tools
16+
- `hazard`: `18` tools
17+
- `metadata`: `3` tools
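The per-family counts can be cross-checked against the advertised total. A small sketch using only the numbers listed above:

```python
# Per-family tool counts as reported in the discovery summary.
family_counts = {
    "bioactivity": 14,
    "chemical": 10,
    "chemical_list": 2,
    "exposure": 32,
    "hazard": 18,
    "metadata": 3,
}

total = sum(family_counts.values())
largest_family = max(family_counts, key=family_counts.get)

# The families must account for every advertised tool.
assert total == 79
```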
18+
19+
## Representative live runtime checks
20+
21+
| Family | Representative tool | Input | `structuredContent.data` | Size | Result |
| --- | --- | --- | --- | ---: | --- |
| `chemical` | `get_chemical_details` | `{"identifier":"DTXSID7020182","id_type":"dtxsid","subset":"default"}` | `dict` | `74` | **PASS** |
| `bioactivity` | `get_bioactivity_aed` | `{"dtxsid":"DTXSID7020182"}` | `list` | `662` | **PASS** |
| `exposure` | `get_exposure_httk` | `{"dtxsid":"DTXSID7020182"}` | `list` | `18` | **PASS** |
| `hazard` | `get_hazard_adme_ivive` | `{"dtxsid":"DTXSID7020182"}` | `list` | `18` | **PASS** |
| `chemical_list` | `get_public_list_names` | `{}` | `list` | `8` | **PASS** |
| `metadata` | `metadata_list_applicability_domain` | `{"limit":10}` | `dict` | `3` | **PASS** |
29+
30+
## Dashboard coverage mapping
31+
32+
| Dashboard area | MCP family | Runtime coverage | Notes |
| --- | --- | --- | --- |
| Chemical identity/detail | `chemical` | Covered | `get_chemical_details` returned a populated structured object and now also exposes `structuredContent.data`. |
| AED / bioactivity | `bioactivity` | Covered | `get_bioactivity_aed` returned `662` rows for `DTXSID7020182`. |
| HTTK / exposure | `exposure` | Covered | `get_exposure_httk` returned `18` rows for `DTXSID7020182`. |
| ADME / IVIVE / hazard | `hazard` | Covered | `get_hazard_adme_ivive` returned `18` rows for `DTXSID7020182`. |
| Chemical lists | `chemical_list` | Covered | `get_public_list_names` now returns `8` public list names; `get_full_list("CCL")` remained live throughout. |
| Metadata / reference registries | `metadata` | Covered | `metadata_list_applicability_domain` returned `3` applicability-domain records and now also exposes `structuredContent.data`. |
| Cheminformatics | not exposed | Not covered | No live MCP tools are currently advertised for this area. |
41+
42+
## Findings
43+
44+
- The priority scientific paths requested for the audit are live and returning data: `AED`, `HTTK`, and `ADME/IVIVE`.
45+
- Family-level dashboard coverage is now complete for all currently exposed MCP families: `chemical`, `bioactivity`, `exposure`, `hazard`, `chemical_list`, and `metadata` all have successful live runtime proof.
46+
- `chemical_list` discovery now works through the shared `ctxpy` client, so non-MCP callers and MCP callers use the same fallback behavior when the upstream enumeration endpoint returns `404`.
47+
- Client parsing is now normalized around `structuredContent.data` for both success and error responses, while preserving existing domain-specific top-level keys for backward compatibility.
48+
- No `cheminformatics` tools are currently exposed through MCP, so that dashboard area remains outside current interface coverage.
49+
50+
## Conclusion
51+
52+
The current MCP server is functionally usable across all exposed CompTox dashboard families relevant to this project. The remaining interface gap is not a runtime failure but a product-scope gap: `cheminformatics` is still not exported as live MCP tools.
53+
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
1+
# MCP Patch Verification (2026-03-18)
2+
3+
## Scope
4+
5+
- Server: `http://127.0.0.1:8002/mcp`
6+
- Patch set:
7+
- restore `chemical_list.get_public_list_names`
8+
- normalize `structuredContent.data` across success and error responses
9+
10+
## Live verification results
11+
12+
### 1. `get_public_list_names` recovery
13+
14+
- Result: **PASS**
15+
- Runtime behavior: returns a non-error response with `structuredContent.data`
16+
- Returned count: `8`
17+
- Sample values: `CCL`, `CCL1`, `CPDAT`, `CPDATv2`, `CTD`
18+
- Implementation note: upstream CTX list-enumeration endpoint currently returns `404`, so the MCP now falls back to a maintained catalog of verified public list names while `get_full_list(list_name)` continues to use the live CTX API.
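The fallback described in the implementation note can be sketched as follows. This is an illustration, not the repository's code: `fetch_upstream_names`, `UpstreamNotFound`, and the `source` field are hypothetical, and `FALLBACK_LIST_NAMES` holds only the five sample values reported above rather than the full maintained catalog.

```python
class UpstreamNotFound(Exception):
    """Hypothetical error raised when the upstream endpoint returns 404."""


# Only the sample values reported in this document; the real catalog has 8 names.
FALLBACK_LIST_NAMES = ["CCL", "CCL1", "CPDAT", "CPDATv2", "CTD"]


def get_public_list_names(fetch_upstream_names):
    """Prefer the live enumeration; fall back to the maintained catalog on 404."""
    try:
        return {"data": list(fetch_upstream_names()), "source": "upstream"}
    except UpstreamNotFound:
        return {"data": list(FALLBACK_LIST_NAMES), "source": "fallback"}


def always_404():
    # Simulates the current upstream behaviour for the enumeration endpoint.
    raise UpstreamNotFound("list enumeration endpoint returned 404")


fallback_result = get_public_list_names(always_404)
live_result = get_public_list_names(lambda: ["CCL"])
```

Only the 404 case triggers the fallback in this sketch; other upstream errors propagate, so an outage is not silently masked by stale catalog data.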
19+
20+
### 2. Dict-shaped success responses now expose `structuredContent.data`
21+
22+
- `get_chemical_details(DTXSID7020182)`
23+
- Result: **PASS**
24+
- `structuredContent.data`: present
25+
- Payload type: `dict`
26+
- `metadata_list_applicability_domain(limit=10)`
27+
- Result: **PASS**
28+
- `structuredContent.data`: present
29+
- Backward-compatible top-level keys preserved: `applicabilityDomains`, `nextCursor`, `metadata`
30+
31+
### 3. Error responses now expose `structuredContent.data`
32+
33+
- Probe: `get_chemical_details(DTXSID_NOT_REAL, id_type="dtxsid")`
34+
- Result: **PASS**
35+
- Error semantics preserved: `isError=true`
36+
- Normalization confirmed: `structuredContent.data = null`
37+
38+
## Outcome
39+
40+
The MCP now has a consistent client-facing parsing contract:
41+
- Success responses always expose `structuredContent.data`
42+
- Error responses expose `structuredContent.data = null`
43+
- Existing top-level domain-specific keys remain available for backward compatibility
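A client can lean on this contract with a single accessor. The sketch below assumes a tool result is a dict carrying `isError` and `structuredContent` keys as described above; `extract_data` and `ToolCallError` are hypothetical names, not part of the server or the `ctxpy` client.

```python
class ToolCallError(Exception):
    """Hypothetical client-side error for results flagged isError."""


def extract_data(result: dict):
    """Return structuredContent.data, raising on error results."""
    if result.get("isError"):
        # Error responses carry structuredContent.data = null by contract.
        raise ToolCallError("tool call failed")
    structured = result.get("structuredContent") or {}
    if "data" not in structured:
        raise KeyError("structuredContent.data missing from success response")
    return structured["data"]


success = {
    "isError": False,
    "structuredContent": {"data": {"dtxsid": "DTXSID7020182"}, "metadata": {}},
}
failure = {"isError": True, "structuredContent": {"data": None}}

payload = extract_data(success)
try:
    extract_data(failure)
    failure_raised = False
except ToolCallError:
    failure_raised = True
```

Callers that still read the domain-specific top-level keys are unaffected, since the accessor only touches `data`.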
44+
45+
## Files changed
46+
47+
- `src/epacomp_tox/resources/chemical_list.py`
48+
- `src/epacomp_tox/server.py`

CHANGELOG.md

Lines changed: 10 additions & 7 deletions
@@ -2,13 +2,16 @@
22

33
## [Unreleased]
44

5-
- tightened repository governance with support, intake, review-ownership, and dependency-automation hygiene
6-
- expanded release discipline around public-surface validation and metadata consistency checks
7-
- standardized GitHub workflow job names in preparation for required branch-protection status checks
8-
- upgraded GitHub Actions workflow dependencies toward Node 24-compatible versions and opted dependency review into the Node 24 runtime
9-
- pinned GitHub workflow actions to immutable SHAs, added CodeQL scanning, and added workflow-hardening regression coverage
10-
- added a release/workflow-dispatch pipeline that builds distributions, emits a CycloneDX SBOM artifact, and publishes signed provenance/SBOM attestations
11-
- documented online and offline verification of signed release provenance and SBOM attestations for downstream consumers
5+
## [0.2.3] - 2026-04-15
6+
7+
- hardened audit subsystem with SHA-256 content hashing, sequential chain linkage, and tamper-evident event verification
8+
- added privacy-aware audit parameter scrubbing for sensitive identifiers (DTXSID, CASRN, SMILES, InChI, InChIKey)
9+
- captured upstream response provenance in `BaseResource` with `response_hash`, `retrieved_at`, and `retry_count`
10+
- added W3C `traceparent` propagation in HTTP transport and injected runtime provenance into orchestrator bundles
11+
- hardened `AuditBundleStore` with bundle checksums, previous-bundle hash linkage, and chain integrity verification
12+
- defaulted AD clearance to `True` in orchestrator when predictive tasks exist, with explicit opt-out still supported
13+
- added advisory `reviewCheckpoints` metadata to orchestrator bundle outputs
14+
- kept the public MCP boundary unchanged; all changes are internal governance, privacy, and traceability improvements
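To make the first `0.2.3` entry concrete, here is a minimal sketch of a SHA-256 hash chain over audit events. The field names `contentHash`, `previousHash`, `sequence`, and `timestamp` come from the commit message; `append_event`, this `verify_chain`, the canonical-JSON encoding, and the all-zero genesis hash are assumptions for illustration and may differ from the project's own `verify_event_hash()` and `verify_chain()`.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # assumed previousHash for the first event


def _content_hash(payload, previous_hash, sequence, timestamp):
    # Canonical JSON (sorted keys, fixed separators) keeps the hash reproducible.
    body = json.dumps(
        {"payload": payload, "previousHash": previous_hash,
         "sequence": sequence, "timestamp": timestamp},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_event(chain, payload):
    """Append an event linked to the hash of the previous one."""
    previous_hash = chain[-1]["contentHash"] if chain else GENESIS_HASH
    sequence = len(chain)
    timestamp = datetime.now(timezone.utc).isoformat()
    event = {
        "payload": payload,
        "previousHash": previous_hash,
        "sequence": sequence,
        "timestamp": timestamp,
        "contentHash": _content_hash(payload, previous_hash, sequence, timestamp),
    }
    chain.append(event)
    return event


def verify_chain(chain):
    """Recompute every hash and check sequence numbers and linkage."""
    expected_previous = GENESIS_HASH
    for index, event in enumerate(chain):
        recomputed = _content_hash(event["payload"], event["previousHash"],
                                   event["sequence"], event["timestamp"])
        if (event["sequence"] != index
                or event["previousHash"] != expected_previous
                or event["contentHash"] != recomputed):
            return False
        expected_previous = event["contentHash"]
    return True
```

Because each event's hash covers the previous event's hash, editing any earlier payload invalidates that event and every later link.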
1215

1316
## [0.2.2] - 2026-04-12
1417
