Releases: emmanuelgjr/GenAI-Security-Crosswalk

v3.1.0 — 25 Frameworks, 114 Incidents, Gap Analysis, OSCAL/GRC Export

10 Apr 03:54

Highlights

Registry Coverage: 8% → 100%

  • 11 new framework registries added and 12 existing ones updated, for a total of 25 frameworks and 1,514 controls
  • Classifier P@1: 0.073 → 0.585 (8x improvement)
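For readers unfamiliar with the metric: P@1 (precision at 1) is the share of queries for which the classifier's top-ranked prediction is the correct one. The sketch below shows that calculation on toy data and checks the improvement ratio quoted above. The function name and the sample control IDs are illustrative assumptions, not code or data from `classifier/`.

```python
def precision_at_1(ranked_predictions, gold_labels):
    """Fraction of queries whose top-ranked prediction matches the gold label."""
    if not gold_labels:
        raise ValueError("need at least one query")
    hits = sum(
        1
        for ranked, gold in zip(ranked_predictions, gold_labels)
        if ranked and ranked[0] == gold
    )
    return hits / len(gold_labels)


# Hypothetical toy data: three queries, each with a ranked list of control IDs.
ranked = [["AC-2", "AC-3"], ["SI-4", "AU-6"], ["RA-5", "CM-8"]]
gold = ["AC-2", "AU-6", "RA-5"]
p_at_1 = precision_at_1(ranked, gold)  # two of the three top-1 guesses are right

# Ratio between the two P@1 values reported in these notes.
improvement = 0.585 / 0.073
```

The ratio comes out at roughly 8.0, which matches the "8x" figure.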

Incident Database: 50 → 114

  • 64 new incidents sourced from OWASP ASI Agentic Exploits tracker, CVE databases, and curated research
  • Covers: Claude state-sponsored attacks, MCP supply chain exploits, IDE universal RCE patterns, zero-click enterprise data exfiltration, vibe-coding security disasters
  • Categories: 64 real-world, 47 research-demonstrated, 3 red-team
  • Date range: 2022-2026 with heavy 2025 agentic coverage

Gap Analysis Visualization

  • New #/gaps page — select frameworks, see red/yellow/green heatmap
  • PDF export (board-ready) and CSV export
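The notes do not say how heatmap colors are assigned. As a rough sketch only, one plausible rule buckets each framework × entry cell by the share of controls that are mapped, and writes the same rows to CSV. The 0.8 and 0.4 cut-offs, the function names, and the sample cells are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical cut-offs; the release notes do not define the color rule.
GREEN_MIN = 0.8
YELLOW_MIN = 0.4


def heat_color(mapped, total):
    """Bucket one cell by the share of its controls that are mapped."""
    share = mapped / total if total else 0.0
    if share >= GREEN_MIN:
        return "green"
    if share >= YELLOW_MIN:
        return "yellow"
    return "red"


def to_csv(cells):
    """Serialise (framework, entry, mapped, total) rows together with their color."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["framework", "entry", "mapped", "total", "color"])
    for framework, entry, mapped, total in cells:
        writer.writerow([framework, entry, mapped, total, heat_color(mapped, total)])
    return buf.getvalue()


# Placeholder cells, not real coverage figures.
cells = [("EU AI Act", "LLM01", 4, 5), ("DORA", "LLM01", 2, 5), ("SOC 2", "ASI01", 0, 3)]
report = to_csv(cells)
```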

New Tools

  • `classifier/finetune.py` — contrastive fine-tuning pipeline for BGE-small
  • `scripts/framework-diff.js` — version diffing with changelog generation
  • `scripts/extract-registry.js` — auto-extract controls from entry data

Enhanced Export

  • `--format oscal-catalog` — OSCAL 1.1.2 Catalog with OWASP coverage annotations
  • `--format grc` — ServiceNow/Archer/Drata-ready JSON

Stats

| Metric | v3.0.0 | v3.1.0 |
| --- | --- | --- |
| Frameworks | 14 | 25 |
| Controls | 505 | 1,514 |
| Incidents | 50 | 114 |
| GT Coverage | 8.0% | 100% |
| P@1 | 0.073 | 0.585 |
| Export formats | 4 | 6 |

Full changelog: CHANGELOG.md

v2.0.0 — Web App, Evidence-Based Scoring, 50 Incidents, Leaderboard

29 Mar 05:06

Major release

Live web app

https://emmanuelgjr.github.io/GenAI-Security-Crosswalk/

7 pages: Landing, Explorer, Frameworks, Incidents, Score, Leaderboard, About. Dark mode, responsive, zero dependencies.

Evidence-based scoring

Three validation tiers — because self-assessed checkboxes are meaningless:

| Tier | How |
| --- | --- |
| Self-Assessed | Check framework boxes |
| Tool-Verified | Upload Garak/PyRIT/LAAF/compliance JSON (schema and fingerprint validated) |
| Independently Attested | Third-party assessor submits via GitHub Issue |

Suspicious results (all-pass, all-zero) are automatically flagged.
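A check of this kind could look like the following minimal sketch. It assumes an uploaded result set can be reduced to a mapping from probe name to pass rate. That input shape and the function name are hypothetical, and the web app's actual validation logic may differ.

```python
def flag_suspicious(results):
    """Return the reasons an uploaded result set looks implausible.

    `results` maps a probe name to its pass rate in [0, 1] (assumed shape).
    """
    rates = list(results.values())
    reasons = []
    # Guard on `rates` so an empty upload is not reported as both patterns.
    if rates and all(rate == 1.0 for rate in rates):
        reasons.append("all-pass")
    if rates and all(rate == 0.0 for rate in rates):
        reasons.append("all-zero")
    return reasons


clean = flag_suspicious({"probe_a": 1.0, "probe_b": 0.5})
too_good = flag_suspicious({"probe_a": 1.0, "probe_b": 1.0})
```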

50 incidents — 100% entry coverage

Every one of the 41 OWASP entries has at least one documented incident. 19 new incidents cover the 2025 gap year, including DeepSeek, o1/o3 jailbreaks, Cursor AI secrets, the Clearview bias settlement, Apollo Research scheming, and more.

21 recipes — 5 deployment architectures

8 new recipes: agent memory sanitization, multi-agent message validation, credential rotation, output guardrails, training data provenance, PII redaction, differential privacy, data retention enforcement.

25 eval profiles

6 new Garak (LLM03/05/08/10, ASI07/08) + 3 new PyRIT (ASI04, DSGAI08, DSGAI17).

70+ tools

17 new, including Inspect AI, TextAttack, Counterfit, Foolbox, Mindgard, Agentic Security, OpenAI Evals, Vigil, Arize Phoenix, AgentOps, LangSmith, Weave, OpenLLMetry, WhyLogs, Evidently, and MLflow.

CLI query interface

node scripts/query.js --stats
node scripts/query.js --severity Critical
node scripts/query.js --framework "EU AI Act"
node scripts/query.js --entry LLM01 --mappings
node scripts/query.js --incident-search "deepfake"

Leaderboard

Public ranking at #/leaderboard. Submit via GitHub Issue template. Validation tier shown per entry.

npm package

npm install @owasp/genai-crosswalk

TypeScript types for all data structures. 12 tests passing.


Stats: 20 frameworks · 67 mapping files · 50 incidents · 41 entries · 21 recipes · 25 eval profiles · 70+ tools · 7 scripts

Created and led by Emmanuel Guilherme Junior — OWASP GenAI Data Security Initiative Lead

v1.8.0 — Web App, Evidence-Based Scoring, Viral Sharing

28 Mar 21:02

What's new

GitHub Pages web app — live now

https://emmanuelgjr.github.io/GenAI-Security-Crosswalk/

6-page SPA: Landing · Explorer · Frameworks Matrix · Incidents · Score · About

Dark mode, responsive, zero dependencies, keyboard accessible.

Evidence-based scoring

Three tiers — because self-assessed checkboxes are meaningless:

| Tier | Badge | How |
| --- | --- | --- |
| Self-Assessed | ❓ Grey | Check framework boxes |
| Partially Validated | ⚠️ Amber | Upload at least one tool output |
| Tool-Validated | ✅ Green | 20+ entries validated by evidence |

Upload JSON from your crosswalk tool runs:

  • node scripts/compliance-report.js --format json
  • Garak JSON results
  • PyRIT JSON results
  • python evals/laaf/laaf_crosswalk.py --format json

Score measures control depth (severity-weighted average), not binary framework presence.
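To make the difference from a binary count concrete, here is a small sketch of a severity-weighted average. The weights (Critical 4 down to Low 1), the 0 to 1 depth scale, and the function name are assumptions for illustration, since the notes do not publish the actual formula.

```python
# Assumed weights; the real scoring weights are not stated in these notes.
SEVERITY_WEIGHT = {"Critical": 4, "High": 3, "Medium": 2, "Low": 1}


def weighted_score(entries):
    """Score 0-100 from (severity, depth) pairs, where depth is in [0, 1]."""
    total_weight = sum(SEVERITY_WEIGHT[severity] for severity, _ in entries)
    if total_weight == 0:
        return 0.0
    weighted = sum(SEVERITY_WEIGHT[severity] * depth for severity, depth in entries)
    return 100 * weighted / total_weight


# One fully covered Critical entry, one half-covered High, one uncovered Medium.
sample = [("Critical", 1.0), ("High", 0.5), ("Medium", 0.0)]
score = weighted_score(sample)
unweighted = 100 * sum(depth for _, depth in sample) / len(sample)
```

With these inputs the weighted score is about 61 while the plain average is 50, because coverage of the Critical entry counts for more.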

Viral sharing

  • 🖼️ Canvas-rendered 1200×630 PNG score card
  • LinkedIn / X share buttons with pre-filled text
  • Embeddable SVG badge with copy-paste code
  • Deep links: #/explorer/LLM01, #/incidents/INC-001
  • OG image for social previews

Stats: 20 frameworks · 67 mapping files · 31 incidents · 41 entries · TypeScript SDK · Evidence-based scoring

v1.7.0 — FedRAMP + DORA, npm Package, SBOM, 20 Frameworks

28 Mar 17:42

What's new

Frameworks 19 & 20: FedRAMP + DORA

FedRAMP AI overlay — US federal cloud AI authorization (SP 800-53 Rev 5 controls):

  • LLM_FedRAMP.md · Agentic_FedRAMP.md · DSGAI_FedRAMP.md
  • AC/AU/CA/CM/IA/IR/PM/RA/SA/SC/SI/SR control families per entry

DORA — EU Digital Operational Resilience Act (Regulation 2022/2554, mandatory for financial entities):

  • LLM_DORA.md · Agentic_DORA.md · DSGAI_DORA.md
  • Art. 5–45 per entry (ICT risk, incident reporting, resilience testing, third-party oversight)

npm package: @owasp/genai-crosswalk

npm install @owasp/genai-crosswalk
import { getEntry, getFramework, searchEntries, incidents } from '@owasp/genai-crosswalk';

Full TypeScript types for Entry, Incident, Mapping, MaestroLayer. API: getEntry(), getFramework(), searchEntries(), getBySeverity(), getIncidentsForEntry(), getIncidentsByLayer(). 12 smoke tests included.

SBOM generation

  • .github/workflows/sbom.yml — CycloneDX SBOM on every release tag
  • scripts/sbom-inventory.js — content-level SBOM of all data assets (SHA-256 hashed)
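The idea behind a content-level inventory can be sketched in a few lines: walk a directory and record a SHA-256 digest per file. This is a Python illustration of the concept, not a port of `scripts/sbom-inventory.js`. The record fields (`path`, `bytes`, `sha256`) and function names are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large assets are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(root):
    """List every file under `root` with its relative path, size, and digest."""
    root = Path(root)
    return [
        {
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        }
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]


# Demo on a throwaway directory rather than the real data/ folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "incidents.json").write_text("[]", encoding="utf-8")
    demo = inventory(tmp)
```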

generate.js parser fix

SP 800-218A, FedRAMP, and DORA added to FRAMEWORK_FILES catalog — all three now parsed into data/entries/ JSON. 66 of 67 mapping files parsed (AIVSS remains special-cased).


Stats: 20 frameworks · 67 mapping files · 31 incidents · 41 entries · 6 scripts · TypeScript SDK

v1.6.0 — NIST SP 800-218A, 31 Incidents, STIX/OSCAL, Automation

28 Mar 17:03

What's new

18th framework: NIST SP 800-218A

Secure Software Development Practices for Generative AI and Dual-Use Foundation Models — 3 new mapping files (3,230 lines) covering all 41 OWASP entries across PW/PS/RV practice groups.

| File | Entries |
| --- | --- |
| llm-top10/LLM_SP800218A.md | LLM01–LLM10 |
| agentic-top10/Agentic_SP800218A.md | ASI01–ASI10 |
| dsgai-2026/DSGAI_SP800218A.md | DSGAI01–DSGAI21 |

10 new incidents (31 total)

| ID | Title | Severity |
| --- | --- | --- |
| INC-022 | Greshake et al. indirect prompt injection | Critical |
| INC-023 | Morris II multi-agent worm | Critical |
| INC-024 | Slack AI indirect injection | Critical |
| INC-025 | GitHub Copilot Workspace injection | High |
| INC-026 | AI deepfake CEO fraud ($25.6M) | Critical |
| INC-027 | MathPrompt symbolic jailbreak | Critical |
| INC-028 | Many-shot jailbreaking (Anthropic) | High |
| INC-029 | Crescendo multi-turn attack (Microsoft) | High |
| INC-030 | Skeleton Key direct override (Microsoft) | High |
| INC-031 | Meta Galactica misinformation takedown | High |

Enterprise export formats

node scripts/incidents-report.js --format stix   # STIX 2.1 → Splunk/Sentinel
node scripts/compliance-report.js --format oscal  # OSCAL 1.1.2 → GRC platforms

Automated source monitoring

scripts/watch.js monitors OWASP repos, arXiv, NVD CVEs, and framework pages weekly — opens GitHub Issues for detected changes. CI workflow at .github/workflows/weekly-watch.yml.

Infrastructure

  • package.json — reproducible installs (node >=18), npm scripts
  • i18n/es/README.md — Spanish seed translation
  • i18n/ja/README.md — Japanese seed translation
  • i18n/de/README.md — German seed translation

Stats: 18 frameworks · 61 mapping files · 31 incidents · 41 entries · 5 scripts (3,610 lines)

v1.5.7 — LAAF v2.0 LPCI Red-Teaming Integration

28 Mar 14:29

What's new

LAAF v2.0 integration — the first automated red-teaming framework purpose-built for Logic-layer Prompt Control Injection (LPCI) vulnerabilities in agentic LLM systems.

New: evals/laaf/

| File | Purpose |
| --- | --- |
| README.md | LAAF vs Garak vs PyRIT comparison, technique taxonomy × OWASP crosswalk, CI integration guide |
| run_laaf.sh | 6-stage suite runner with per-stage thresholds (S1/S3/S4/S6 = 0%, S2 = 5%, S5 = 10%) |
| laaf_crosswalk.py | OWASP/MAESTRO-mapped crosswalk reporter (MD/CSV/JSON) |
| stage_configs/s1–s6.yaml | Crosswalk-specific stage configs with realistic agentic deployment personas |
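The per-stage thresholds quoted for `run_laaf.sh` amount to a simple gate. The sketch below assumes each threshold is the maximum tolerated breakthrough rate and that a stage fails only when its rate strictly exceeds it. That reading, the function name, and the sample rates are assumptions and not taken from the runner itself.

```python
# Thresholds as listed above, expressed as fractions.
STAGE_THRESHOLDS = {"S1": 0.0, "S2": 0.05, "S3": 0.0, "S4": 0.0, "S5": 0.10, "S6": 0.0}


def failing_stages(breakthrough_rates):
    """Return the stages whose measured rate exceeds the allowed threshold."""
    return sorted(
        stage
        for stage, rate in breakthrough_rates.items()
        if rate > STAGE_THRESHOLDS[stage]
    )


# Hypothetical run: S2 sits exactly on its limit, S5 and S6 go over theirs.
measured = {"S1": 0.0, "S2": 0.05, "S5": 0.12, "S6": 0.01}
failed = failing_stages(measured)
```

In CI, a non-empty `failed` list would translate into a non-zero exit code for the job.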

New: data/tools-supplement.json

Persistent supplement mechanism for tools not extractable from markdown files. LAAF v2.0 added to 8 OWASP entries: LLM01, LLM06, LLM07, ASI01, ASI02, ASI03, ASI06, DSGAI04.

Updated: data/incidents.json → v1.1.0

INC-021 added: LAAF empirical study — 67% (GPT-4o-mini), 85% (Claude-3-Sonnet), 92% (Gemini-2.0-Flash) breakthrough rates across 5 production LLMs (arXiv:2507.10457).

Updated: evals/ci/github-action.yml

New laaf-eval job: 5 LAAF scan steps with stage → OWASP failure annotations, schedule + manual dispatch only, 45-min timeout.

Fixed: scripts/generate.js

  • acc[id].tools = mergeTools(acc[id].tools, tools) — result was not being assigned back (tools were silently dropped)
  • Incidents loading from data/incidents.json survives re-generation
  • mergeTools() deduplicates by tool name across markdown-parsed + supplement sources
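The bug and its fix are easy to reproduce in miniature. The following is a Python rendering of the idea, not the JavaScript in `scripts/generate.js`. It assumes tools are records with a `name` field, that names match exactly, and that the first entry seen wins. The `source` field is added only to make the outcome visible.

```python
def merge_tools(existing, incoming):
    """Union two tool lists, keeping the first record seen for each name."""
    merged = {}
    for tool in list(existing) + list(incoming):
        merged.setdefault(tool["name"], tool)
    return list(merged.values())


acc = {"LLM01": {"tools": [{"name": "Garak", "source": "markdown"}]}}
supplement = [
    {"name": "LAAF", "source": "supplement"},
    {"name": "Garak", "source": "supplement"},  # duplicate of the markdown entry
]

# Buggy pattern: the merged list is computed and then thrown away.
merge_tools(acc["LLM01"]["tools"], supplement)
count_before_fix = len(acc["LLM01"]["tools"])

# Fixed pattern: assign the result back to the accumulator.
acc["LLM01"]["tools"] = merge_tools(acc["LLM01"]["tools"], supplement)
names_after_fix = [tool["name"] for tool in acc["LLM01"]["tools"]]
```

Because the function returns a new list instead of mutating its first argument, forgetting the assignment leaves the accumulator unchanged, which is how supplement tools could disappear silently.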

LPCI Attack Vectors mapped:

  • AV-1 Tool Poisoning → LLM07, ASI01
  • AV-2 Memory-Persistent Encoded Triggers → ASI06, LLM06, DSGAI04
  • AV-3 Role Override → LLM01, ASI01, ASI02
  • AV-4 Vector Store Payload Persistence → DSGAI04, LLM01, ASI03

arXiv: 2507.10457 | Framework: LAAF v2.0

v1.5.6 — Incident Tracker

28 Mar 13:48

What's new in v1.5.6

Incident tracker — 20 incidents with MAESTRO layer attribution

data/incidents.json catalogues 20 real-world and research-demonstrated AI security incidents, each mapped to OWASP entries and MAESTRO architectural layers.

MAESTRO layer roles

Each incident records how each architectural layer was involved:

  • Origin — where the attack initiates
  • Propagation — how it spreads through the system
  • Impact — where harm manifests
  • Blind-spot — where detection or prevention failed
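One way to picture this attribution is a record that maps each layer to its role, plus a filter similar in spirit to the `--layer` option. The record shape, the field names, the one-role-per-layer simplification, and the placeholder incidents below are all assumptions. The real `data/incidents.json` schema may be different.

```python
from dataclasses import dataclass, field

# The four roles named in the list above.
ROLES = {"origin", "propagation", "impact", "blind-spot"}


@dataclass
class Incident:
    """Hypothetical record shape, simplified to one role per layer."""

    id: str
    entries: list
    layer_roles: dict = field(default_factory=dict)  # e.g. {"L3": "origin"}

    def __post_init__(self):
        unknown = set(self.layer_roles.values()) - ROLES
        if unknown:
            raise ValueError(f"unknown layer role(s): {sorted(unknown)}")


def by_layer(incidents, layer, role=None):
    """Select incidents that touch `layer`, optionally in a specific role."""
    return [
        incident
        for incident in incidents
        if layer in incident.layer_roles
        and (role is None or incident.layer_roles[layer] == role)
    ]


# Placeholder incidents, not taken from the actual database.
demo = [
    Incident("INC-X01", ["LLM01"], {"L1": "origin", "L3": "impact"}),
    Incident("INC-X02", ["ASI01"], {"L3": "origin", "L7": "blind-spot"}),
]
touching_l3 = [incident.id for incident in by_layer(demo, "L3")]
originating_l3 = [incident.id for incident in by_layer(demo, "L3", role="origin")]
```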

Query tool

node scripts/incidents-report.js # all incidents → reports/
node scripts/incidents-report.js --entry LLM01 # by OWASP entry
node scripts/incidents-report.js --layer L3 # by MAESTRO layer
node scripts/incidents-report.js --category real-world
node scripts/incidents-report.js --severity Critical
node scripts/incidents-report.js --format csv # Excel

Incidents included (20)

| ID | Incident | Severity | Category |
| --- | --- | --- | --- |
| INC-001 | Samsung ChatGPT data leak | High | Real |
| INC-002 | Bing/Sydney jailbreak | High | Real |
| INC-003 | ChatGPT indirect injection via web | Critical | Research |
| INC-004 | Air Canada chatbot hallucination lawsuit | High | Real |
| INC-005 | Chevrolet chatbot $1 car | Medium | Real |
| INC-006 | OpenAI Redis conversation history leak | High | Real |
| INC-007 | LLM email assistant indirect injection | Critical | Research |
| INC-008 | GitHub Copilot code/secret memorisation | High | Research |
| INC-009 | Hugging Face pickle malware supply chain | Critical | Real |
| INC-010 | Microsoft Copilot document exfiltration | Critical | Research |
| INC-011 | WormGPT dark-web adversarial LLM | High | Real |
| INC-012 | LangChain/LlamaIndex RCE via injection | Critical | Research |
| INC-013 | Perez & Ribeiro foundational injection paper | Critical | Research |
| INC-014 | Clarkesworld AI fiction spam | Medium | Real |
| INC-015 | Multimodal image-embedded injection | High | Research |
| INC-016 | RAG corpus poisoning (PoisonedRAG) | Critical | Research |
| INC-017 | AutoGPT uncontrolled execution | High | Research |
| INC-018 | GPT-4 system prompt extraction | High | Real |
| INC-019 | AI agent IAM privilege escalation | Critical | Research |
| INC-020 | Multi-agent injection cascade | Critical | Research |

Also: generate.js now reads incidents.json and populates incidents[] in all 41 entry files.

v1.5.5 — Compliance Gap Reports

28 Mar 13:37

What's new in v1.5.5

Compliance gap assessment generator

scripts/compliance-report.js — one command produces framework-centric gap reports for all 17 mapped frameworks.

Quick start

node scripts/compliance-report.js # all 17 frameworks → reports/
node scripts/compliance-report.js --framework "EU AI Act" # single framework
node scripts/compliance-report.js --format csv # Excel / Google Sheets
node scripts/compliance-report.js --format json # dashboards / CI
node scripts/compliance-report.js --list-frameworks # list options

What each report contains

  • Executive summary (coverage %, Critical/High gap count, PASS/ACTION REQUIRED status)
  • Framework context (deadline, audience, scope)
  • Gap analysis — OWASP entries with no controls mapped, ordered by severity
  • Coverage matrix — entries × controls, grouped by source list
  • Control detail — every referenced control with linked OWASP entries and notes
  • Prioritised action plan (P1 immediate → P3 medium-term + CI/CD integration)
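The core of the gap analysis described above can be sketched as: find entries with no controls mapped for a framework, sort them by severity, and derive the summary fields. This is an illustrative Python sketch rather than the logic of `scripts/compliance-report.js`. The entry shape, the placeholder severities and control IDs, and the rule that any gap means ACTION REQUIRED are assumptions.

```python
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


def gap_report(entries, framework):
    """Summarise which entries have no controls mapped for `framework`."""
    gaps = [entry for entry in entries if not entry["mappings"].get(framework)]
    gaps.sort(key=lambda entry: (SEVERITY_RANK[entry["severity"]], entry["id"]))
    covered = len(entries) - len(gaps)
    return {
        "coverage_pct": round(100 * covered / len(entries), 1) if entries else 0.0,
        "critical_high_gaps": sum(
            entry["severity"] in ("Critical", "High") for entry in gaps
        ),
        # Assumed rule: any unmapped entry means action is required.
        "status": "ACTION REQUIRED" if gaps else "PASS",
        "gaps": [entry["id"] for entry in gaps],
    }


# Placeholder entries; severities and control IDs are illustrative only.
entries = [
    {"id": "LLM01", "severity": "Critical", "mappings": {"Framework A": ["CTRL-1"]}},
    {"id": "LLM09", "severity": "Medium", "mappings": {}},
    {"id": "ASI05", "severity": "Critical", "mappings": {"Framework A": []}},
    {"id": "DSGAI04", "severity": "High", "mappings": {}},
]
summary = gap_report(entries, "Framework A")
```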

Supported frameworks (17)

EU AI Act · ISO/IEC 42001:2023 · NIST AI RMF 1.0 · ISO/IEC 27001:2022 · SOC 2 · NIST CSF 2.0 · CIS Controls v8.1 · ISA/IEC 62443 · NIST SP 800-82 Rev 3 · OWASP ASVS · OWASP SAMM · PCI DSS v4.0 · MITRE ATLAS · MAESTRO · AIUC-1 · ENISA Multilayer Framework · OWASP NHI Top 10

Also: reports/ and evals/results/ added to .gitignore.

v1.5.4 — Evaluation Profiles

28 Mar 13:27

What's new in v1.5.4

Runnable evaluation profiles — Garak + PyRIT

The OWASP entries listed below now have a corresponding runnable test. Drop these into your CI/CD pipeline to gate merges on security regressions.

Garak profiles (7)

| Profile | OWASP entry | Threshold |
| --- | --- | --- |
| LLM01_prompt_injection.yaml | LLM01 Prompt Injection | 10% |
| LLM02_sensitive_disclosure.yaml | LLM02 Sensitive Info Disclosure | 5% |
| LLM04_data_poisoning.yaml | LLM04 Data/Model Poisoning | 10% |
| LLM07_system_prompt_leakage.yaml | LLM07 System Prompt Leakage | 0% (zero tolerance) |
| LLM09_misinformation.yaml | LLM09 Misinformation | 15% |
| ASI01_goal_hijack.yaml | ASI01 Agent Goal Hijack | 5% |
| ASI05_code_execution.yaml | ASI05 Code/Resource Execution | 0% (zero tolerance) |

PyRIT scripts (3)

| Script | OWASP entry | Scenarios |
| --- | --- | --- |
| llm01_prompt_injection.py | LLM01 | Direct, indirect, goal deviation |
| dsgai04_rag_poisoning.py | DSGAI04 RAG Poisoning | 6 poisoned corpus chunks |
| asi01_goal_hijack.py | ASI01 Agent Goal Hijack | 6 multi-turn adversarial scenarios |

CI/CD template

evals/ci/github-action.yml — copy to .github/workflows/ in any repo with LLM code. Garak runs on every PR touching src/llm/, prompts/, or evals/**. PyRIT runs on weekly schedule and manual dispatch to control cost.

Quick start

pip install garak
export OPENAI_API_KEY=sk-...
bash evals/garak/run_all.sh

See evals/README.md for full setup, result interpretation, and threshold customisation.

v1.5.1 — Complete Coverage Matrix + Contributor Infrastructure

28 Mar 03:25

GenAI Security Crosswalk v1.5.1

The first fully complete release — all 17 frameworks mapped across all 3 OWASP GenAI source lists, plus the full contributor and operational infrastructure to sustain and scale the project.


What's in this release

Complete framework coverage matrix (v1.5.0)

All 51 matrix cells are now ✅ — every framework mapped to every source list.

| Metric | Count |
| --- | --- |
| Source lists | 3 (LLM Top 10 2025, Agentic Top 10 2026, DSGAI 2026) |
| Frameworks | 17 |
| Mapping files | 58 |
| Implementation recipes | 13 |
| Open-source tools catalogued | 40+ |

New mapping files closing the matrix:

  • Agentic_SAMM.md — OWASP SAMM v2.0 maturity scorecard for agentic AI
  • DSGAI_SAMM.md — SAMM maturity roadmap for GenAI data security (includes GDPR baseline)
  • Agentic_NISTSP80082.md — OT agent placement, SP 800-53 controls, NERC CIP/AWIA/CMMC crosswalk
  • DSGAI_NISTSP80082.md — GenAI data placement in OT networks, U.S. regulatory crosswalk
  • LLM_AIUC1.md — AIUC-1 six-domain certification readiness for LLM deployments
  • DSGAI_AIUC1.md — Domain A (Data & Privacy) covers 50%+ of DSGAI entries
  • LLM_NHI.md — NHI credential controls for LLM deployments
  • DSGAI_NHI.md — NHI as enabling condition for DSGAI risks

Contributor infrastructure (v1.5.1)

  • scripts/validate.js — content validator with 10 structural checks
  • .github/workflows/validate.yml — CI pipeline running on every PR
  • GOVERNANCE.md — maintainer roles, PR review SLOs, decision-making process
  • shared/TEMPLATE.md — canonical template for new mapping files
  • data/README.md — data layer documentation with jq query examples
  • i18n/WORKFLOW.md — translation contributor guide (es, fr, pt ready)
  • Three issue templates: bug report, content update, new framework proposal
  • Enhanced PR template with content and hygiene checklists
  • Expanded CODEOWNERS covering all folders

Start here — by role

EU AI Act compliance before August 2026
→ LLM_EUAIAct.md · Agentic_EUAIAct.md · DSGAI_EUAIAct.md

Deploying autonomous agents
→ Agentic_MAESTRO.md → Agentic_AIVSS.md → Agentic_OWASP_NHI.md

SOC 2 / GRC assessment
→ LLM_SOC2.md · Agentic_SOC2.md · LLM_SAMM.md

AppSec / red team test plan
→ Agentic_AITG.md · DSGAI_MITREATLAS.md · shared/RECIPES.md

AI in OT/ICS environments
→ Agentic_NISTSP80082.md · Agentic_ISA62443.md · DSGAI_ISA62443.md


Maintained by Emmanuel Guilherme Junior and the OWASP GenAI Data Security Initiative.

License: CC BY-SA 4.0