ATF author here, excited about what you've built #301

jman-msc · 2026-03-19T03:26:22Z

jman-msc
Mar 19, 2026

Hi Imran,

I'm Josh Woodruff, author of the Agentic Trust Framework and co-chair of the CSA Zero Trust Working Group.

I just found your CSA-ATF-PROPOSAL.md and wanted to reach out directly. You built full coverage across all five pillars, got this moved into the microsoft/ org, and formally proposed CSA working group engagement. All within 30 days of the blog post. That's impressive, and I appreciate it.

I fully support positioning the Agent Governance Toolkit as an ATF reference implementation. On your three proposed spec additions:

Agent delegation chain verification: You're right that this is a gap. I want to formalize this in v0.2.0.
AI-BOM integration: Directly relevant to the Data Governance pillar and to EU AI Act alignment. I'm very interested.
Trust scoring quantification: The maturity model has qualitative promotion criteria today. A quantified scoring methodology would make it much stronger.

I'd like to invite you to contribute to the ATF specification through the CSA Zero Trust Working Group. I'll be at RSAC 2026 (March 23-26 in San Francisco) and would love to meet in person if you're attending. If not, let's set up a call.

I also just ran a CSA webinar on ATF. Here's the recording if you want to see the current state of the spec and where it's heading: CSA ATF Webinar Recording

One more thing: I'm developing a formal ATF Conformance specification, defining what "ATF Compatible" means as a verifiable checklist based on the five elements. You've already mapped against all 15 requirements. I'd welcome your input as a co-author. Your implementation experience is exactly what keeps a conformance spec practical instead of theoretical.

Looking forward to connecting.

Josh Woodruff
Founder and CEO, MassiveScale.AI
CSA Research Fellow | IANS Faculty | Co-Chair, CSA Zero Trust Working Group
agentictrustframework.ai

imran-siddique · 2026-03-19T22:05:06Z

imran-siddique
Mar 19, 2026
Collaborator

@jman-msc — thank you for reaching out directly, and for the generous words. Finding the ATF spec back in February was one of those "someone already articulated the threat model we've been building against" moments. Your five-pillar structure gave us a clean compliance target, and mapping our toolkit against all 15 requirements validated that we'd been converging on the same architecture from the implementation side.

On the three spec additions

1. Agent delegation chain verification (v0.2.0)

This is the gap that shows up the moment you move from single-agent to multi-agent systems. Our AgentMesh package already implements this — cryptographically signed delegation chains where each hop in an agent-to-agent call carries:

The originating agent's DID identity
A signed delegation token with scoped permissions and TTL
A Merkle-chained audit trail so any node in the chain can verify provenance back to the root

We'd be happy to contribute our implementation patterns and test vectors as input for the v0.2.0 spec. The tricky design decision is how much of the chain each intermediate agent should be able to inspect (full chain vs. only its immediate delegator) — we currently support both modes and have opinions on when each is appropriate.

2. AI-BOM integration

We built a full AI Bill of Materials system in Agent OS that tracks model provenance, training data lineage, weight checksums, and dependency graphs. This maps directly to the Data Governance pillar and is the artifact we believe regulators will eventually require — the EU AI Act's transparency obligations effectively demand something like this.

The piece we think the ATF spec could formalize: a standard schema for AI-BOMs that governance frameworks can validate against. We have a working schema today; would be valuable to align it with what the CSA working group envisions.

3. Trust scoring quantification

The maturity model's qualitative promotion criteria (Intern → Junior → Senior → Principal) are the right abstraction. What we've added on the implementation side is a numeric trust scoring engine that feeds those promotion decisions:

Composite score (0-100) based on weighted signals: task success rate, policy compliance, anomaly frequency, time-in-tier
Automatic promotion/demotion thresholds with configurable hysteresis to prevent oscillation
Trust decay over inactivity periods

Making this quantified methodology part of the spec would give organizations a concrete, auditable basis for autonomy decisions instead of subjective judgment calls. Happy to share our scoring algorithms and the reasoning behind the weight defaults.

On the conformance specification

The co-author invitation is genuinely compelling — our implementation experience across all 15 requirements is exactly the kind of input that keeps conformance criteria practical. We're very interested in this and would like to understand the scope and timeline better before formalizing involvement. Would a call to walk through what you're envisioning for the conformance spec be a good starting point?

On RSAC

I won't be at RSAC this year unfortunately — but I'd very much like to connect. Would you be open to a call the week after RSAC (March 30 week) once you're back? Happy to do a screen-share walkthrough of the toolkit's ATF coverage, the delegation chain implementation, and the trust scoring engine. That might be more productive than a conference hallway conversation anyway.

On the webinar recording

Thank you for sharing — I'll watch it this week and may have follow-up questions on the spec direction, particularly around how conformance testing intersects with the maturity model tiers.

Genuinely excited about this convergence. We built the implementation; you wrote the specification; the alignment is too natural to not formalize. Looking forward to connecting.

Imran Siddique
Principal Software Engineering Manager, Microsoft

0 replies

jman-msc · 2026-03-20T21:25:27Z

jman-msc
Mar 20, 2026
Author

Hi @imran-siddique,

This is exactly the kind of response I was hoping for. Technical depth, specific implementation details, and clear opinions on the design tradeoffs. That's what makes a conformance spec work.

A few things worth calling out:

Delegation chains

The full-chain vs. immediate-delegator visibility question is one of the harder design decisions in v0.2.0. I have opinions on when each mode is appropriate too. I'd rather hash that out on a call than in a thread. Your test vectors and implementation patterns would be a huge input.

AI-BOM schema

Regulators will ask for this first, and most orgs won't have a good answer. If you already have a working schema validated against the EU AI Act transparency requirements, that saves months. Let's align it with what the CSA working group is building.

Trust scoring

The hysteresis to prevent promotion/demotion oscillation is a detail that only shows up after you've run this in production. The ATF maturity model uses MUST/SHOULD/MAY per requirement at each level (Intern through Principal), but promotion criteria between levels are still qualitative. A numeric scoring methodology would close that gap. I want implementation experience driving the spec here, not theory.

On conformance

I've published the ATF Conformance Specification. Two tiers: ATF Compatible (self-attestation against all 25 requirements) and ATF Certified (third-party audit, 90 days production operation, additional operational controls). There's a maturity level matrix that maps each requirement to MUST/SHOULD/MAY at each level. Your toolkit covers all 25 requirements across the five elements. I want your read on whether the conformance criteria are practical from an implementer's seat. That's the blind spot a spec author can't test alone.

We've also launched agentictrustframework.ai as the home for the spec and verifiedagents.ai for assessment and scoring. Both go well past the original blog post. Worth a look before our call.

On timing: week of March 30 works. I'll be back from RSAC with fresh context. A screen-share walkthrough of your ATF coverage, delegation chains, and trust scoring engine is the right format. I'll follow up after RSAC to lock in a time.

Thanks,
-Josh

0 replies

imran-siddique · 2026-03-23T03:21:10Z

imran-siddique
Mar 23, 2026
Collaborator

Josh — excellent, this is moving fast in the right direction.

Conformance spec review

I read through CONFORMANCE.md. A few notes from the implementer's seat:

What works well:

The two-tier model (Compatible vs Certified) is the right structure. Self-attestation as the entry ramp, third-party audit for production — that mirrors SOC 2 Type I vs Type II, which enterprises already understand.
The 25-requirement matrix with MUST/SHOULD/MAY per maturity level gives implementers clear targets. Our toolkit maps against all 25 — I'll prepare a formal mapping table for our call.
The 90-day production operation requirement for Certified is smart. It filters out paper compliance.

Feedback for consideration:

Delegation chain verification (your v0.2.0 addition) should be a MUST at the Senior/Principal level. Without it, multi-agent systems can't prove authorization provenance. I'd argue it's the single biggest gap for enterprise adoption.
AI-BOM should have a MUST requirement for the model identifier and training data hash at minimum. Full provenance is SHOULD. This aligns with EU AI Act Art. 13 technical documentation requirements.
Trust score quantification: the conformance spec should define minimum scoring signals (task success rate, policy compliance, anomaly frequency) even if the weights are implementation-specific. Otherwise "trust scoring" becomes a checkbox without substance.

I'll bring a detailed requirement-by-requirement assessment to the call.

New sites

Checked out agentictrustframework.ai and verifiedagents.ai — significant upgrade from the GitHub-only presence. The assessment model on verifiedagents.ai is exactly what enterprises will ask for when evaluating agent platforms.

Call logistics

Week of March 30 works perfectly. I'll send a calendar invite — suggest 60 minutes for:

ATF conformance walkthrough (your side — 15 min)
AGT implementation demo (my side — 25 min): delegation chains, trust scoring engine, AI-BOM schema
Discussion: v0.2.0 spec additions + conformance criteria feedback (20 min)

What timezone works best for you?

Looking forward to it.
— Imran

0 replies

imran-siddique · 2026-03-23T03:25:27Z

imran-siddique
Mar 23, 2026
Collaborator

Quick update on logistics — my schedule is packed the next few weeks with upcoming OOF, so a live call may be hard to fit. Would you be open to doing this async instead?

I'm thinking: I'll prepare a detailed requirement-by-requirement conformance assessment as a document (our ATF coverage, delegation chain implementation details, trust scoring algorithms, AI-BOM schema) and share it here or as a PR against the ATF repo. You can review at your pace, leave comments, and we iterate on the doc. That way we both get the depth of a call without the scheduling overhead.

If there are specific design decisions that need real-time discussion (like the delegation chain visibility question), we can do a focused 30-min call just for those — but let's see if the doc gets us 90% there first.

Let me know if that works.

0 replies

jman-msc · 2026-04-02T22:51:06Z

jman-msc
Apr 2, 2026
Author

Imran, congrats on the public launch. Seven packages, five language SDKs, 9,500+ tests, 20 tutorials, and framework integrations already shipping. That's serious work.

Async works well for me. A few things, picking up where we left off.

Your conformance feedback

I've been sitting with your notes on CONFORMANCE.md and I agree on all three.

Delegation chains as MUST at Senior/Principal. Yes. Multi-agent systems without verifiable authorization provenance are a non-starter at those maturity levels. The maturity level matrix in CONFORMANCE.md already defines MUST/SHOULD/MAY per requirement at each level. Delegation chain verification will follow that same structure in the next spec revision.

AI-BOM: model identifier + training data hash as MUST, full provenance as SHOULD. That's the right split. It matches how EU AI Act Art. 13 will land in practice. Organizations can hit the floor quickly and build toward full provenance over time. Adopting this.

Trust scoring minimum signals. You're right that without defined signals, "trust scoring" becomes a checkbox. I'll specify task success rate, policy compliance rate, and anomaly frequency as minimum required inputs. Weight selection stays implementation-specific. That gives conformance teeth without locking out different scoring approaches, including your 0-100 composite score model.

The async conformance assessment

Since we last spoke, the spec has moved to v0.9.0 with several new documents that should make the mapping easier:

CROSSWALKS.md maps ATF to CSA AICM, OWASP Agentic Top 10, NIST 800-207/AI RMF, ISO 42001, and ISO 27001
IMPLEMENTATION_PATTERNS.md has technology-agnostic patterns per element with maturity indicators
MATURITY_MODEL.md documents promotion gates, demotion criteria, and the operating model

Your three conformance recommendations will land in the next point release.

If you're still up for it, a PR with your toolkit's coverage mapped against all 25 requirements would be the first formal conformance document from any implementation. That carries weight. I can set up a conformance/ directory in the ATF repo with a template if that helps.

The delegation chain design question

The full-chain vs. immediate-delegator visibility question is one that needs real-time discussion. I have a position on when each mode applies, and I'd rather debate it live than in a thread. When your schedule opens up, I'd like to find 30 minutes for that one.

ATF spec (v0.9.0): https://github.com/massivescale-ai/agentic-trust-framework
Conformance spec: https://github.com/massivescale-ai/agentic-trust-framework/blob/main/CONFORMANCE.md
Crosswalks: https://github.com/massivescale-ai/agentic-trust-framework/blob/main/CROSSWALKS.md

0 replies

imran-siddique · 2026-04-05T16:27:29Z

imran-siddique
Apr 5, 2026
Collaborator

Josh -- thanks for the update and congrats feedback on our launch.

Quick status on the conformance assessment:

Since your last message we shipped v3.0.2 with several changes directly relevant to ATF alignment:

Delegation chains -- delegation_depth tracking with MAX_DELEGATION_DEPTH, parent DID verification, and capability narrowing (child cannot exceed parent caps). Test vectors ready to share.
Trust scoring -- canonical TrustScore/TrustTracker/AgentProfile now in agentmesh.trust_types (single source, was duplicated across 6 integrations). Our 0-1000 scale maps to your tiers.
Content governance -- new ContentQualityEvaluator with 6-dimension scoring (accuracy, completeness, freshness, structure, relevance, consistency). Separate from trust/compliance -- maps to your quality vs safety distinction.
Execution context -- ContextualPolicyEngine with per-context enforcement (inner_loop/ci_cd/autonomous). Policies behave differently depending on where they run.
Deployment runtime -- DockerDeployer and KubernetesDeployer with security hardening (cap-drop ALL, runAsNonRoot, seccomp). Ready for AKS.

I will prepare the full requirement-by-requirement assessment against CONFORMANCE.md and share as a PR against the ATF repo. Targeting next week.

On delegation chain visibility -- agree we should hash out full-chain vs immediate-delegator modes. Our current implementation exposes parent_did (immediate) but can walk the chain via recursive lookup. Happy to document both modes with pros/cons in the assessment.

0 replies

imran-siddique · 2026-04-05T19:00:58Z

imran-siddique
Apr 5, 2026
Collaborator

Josh -- the conformance assessment is ready.

Published at: https://github.com/microsoft/agent-governance-toolkit/blob/main/docs/compliance/atf-conformance-assessment.md

Results against ATF v0.9.0:

25/25 requirements addressed
18 fully met, 7 partially met, 0 not met
Target maturity: Senior

7 gaps documented with specific code citations and recommended fixes:

I-4 Purpose Declaration -- no unified PurposeDeclaration model
B-2 Action Attribution -- inconsistent agent_id vs agent_did naming
B-3 Behavioral Baseline -- in-memory only, no cross-session persistence
D-3 PII/PHI Protection -- regex-only, no ML-based NER
D-5 Data Lineage -- execution-trace only, no dataset-level provenance
R-5 Graceful Degradation -- scattered fallbacks, no unified autonomy controller
.NET SDK uses HMAC fallback instead of Ed25519

Every requirement includes file paths, class names, and how the implementation works. Happy to iterate on any section. If you want, I can also submit this as a PR against the ATF repo with a conformance-statements/ directory.

1 reply

jman-msc Apr 9, 2026
Author

Hi Imran, this is outstanding work.

On the v3.0.2 updates:

The delegation depth tracking with capability narrowing and the canonical TrustScore consolidation are both directly relevant to spec changes queued for the next point release. Good timing.

On the conformance assessment:

This is the first formal conformance document produced against ATF by any implementation. The file-path-level citations against each requirement set the bar for what these should look like. Please submit the PR with a conformance-statements/ directory. That's the right structure.

On the 7 gaps:

several align with spec changes already accepted (delegation chain verification as MUST at Senior/Principal, AI-BOM provenance, trust scoring minimum signals). Your assessment sharpens the priority on the rest. I'll reference specific gap items when those changes land.

On delegation chain visibility:

Your implementation already maps to where I've landed on this. I think the answer is a two-audience model. The governance/supervisory plane MUST always have full-chain visibility, no exceptions. The executing agent sees its immediate delegator only as the default. Your parent_did (immediate) plus recursive lookup (full chain available to governance) is exactly that pattern.

The open question is whether the spec should define a MAY for exposing additional chain context to higher-maturity agents, or whether that's an implementation decision we document as a pattern.

That's the 30 minutes I want. I'll DM you to find a slot.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ATF author here, excited about what you've built #301

Uh oh!

{{title}}

Uh oh!

Replies: 7 comments 1 reply

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{editor}}'s edit

{{editor}}'s edit

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

ATF author here, excited about what you've built #301

Uh oh!

jman-msc Mar 19, 2026

Replies: 7 comments · 1 reply

Uh oh!

imran-siddique Mar 19, 2026 Collaborator

On the three spec additions

On the conformance specification

On RSAC

On the webinar recording

Uh oh!

jman-msc Mar 20, 2026 Author

Delegation chains

AI-BOM schema

Trust scoring

On conformance

Uh oh!

imran-siddique Mar 23, 2026 Collaborator

Conformance spec review

New sites

Call logistics

Uh oh!

imran-siddique Mar 23, 2026 Collaborator

Uh oh!

Uh oh!

jman-msc Apr 2, 2026 Author

Uh oh!

imran-siddique Apr 5, 2026 Collaborator

Uh oh!

imran-siddique Apr 5, 2026 Collaborator

Uh oh!

jman-msc Apr 9, 2026 Author

On the v3.0.2 updates:

On the conformance assessment:

On the 7 gaps:

On delegation chain visibility:

jman-msc
Mar 19, 2026

Replies: 7 comments 1 reply

imran-siddique
Mar 19, 2026
Collaborator

jman-msc
Mar 20, 2026
Author

imran-siddique
Mar 23, 2026
Collaborator

imran-siddique
Mar 23, 2026
Collaborator

jman-msc
Apr 2, 2026
Author

imran-siddique
Apr 5, 2026
Collaborator

imran-siddique
Apr 5, 2026
Collaborator

jman-msc Apr 9, 2026
Author