
feat(outputs): add AWS inventory connectivity graph output format #10382

Open
sandiyochristan wants to merge 2 commits into prowler-cloud:master from sandiyochristan:feat/inventory-graph

sandiyochristan (Contributor) commented Mar 19, 2026

Context

Prowler's existing output formats (CSV, ASFF, OCSF, HTML) surface individual check findings but provide no cross-service topology view. Security engineers need to understand how AWS resources are connected — which Lambda functions sit inside which VPC, which IAM roles can be assumed by which services, which event sources trigger which functions — before they can reason about attack paths, blast-radius, or lateral-movement risk.

This PR adds a new --output-formats inventory-graph mode that derives a connectivity graph from the service clients already loaded during a scan, with zero extra AWS API calls, and writes two artefacts:

  • <output>.inventory.json — machine-readable nodes + edges graph
  • <output>.inventory.html — self-contained interactive D3.js force-directed visualization

Related to the Prowler attack-path roadmap (cartography / Neo4j integration).


Description

New module: prowler/lib/outputs/inventory/

prowler/lib/outputs/inventory/
├── models.py              ResourceNode, ResourceEdge, ConnectivityGraph dataclasses
├── graph_builder.py       Reads loaded service clients from sys.modules (zero extra API calls)
├── inventory_output.py    write_json(), write_html(), generate_inventory_outputs() entry-point
└── extractors/
    ├── lambda_extractor   Lambda functions → VPC/subnet/SG/event-sources/layers/DLQ/KMS
    ├── ec2_extractor      EC2 instances + security groups → subnet/VPC
    ├── vpc_extractor      VPCs, subnets, peering connections
    ├── rds_extractor      RDS instances → VPC/SG/cluster/KMS
    ├── elbv2_extractor    ALB/NLB load balancers → SG/VPC
    ├── s3_extractor       S3 buckets → replication targets/logging buckets/KMS keys
    └── iam_extractor      IAM roles + trust-relationship edges (who can assume what)

Edge semantic types (used for downstream filtering / attack-path analysis):
network · iam · triggers · data_flow · depends_on · replicates_to · encrypts · logs_to
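
To make the node, edge, and edge-type vocabulary concrete, here is a minimal sketch of what the three dataclasses and one extractor could look like. The class names and the edge types come from this description; every field name, the `extract_lambda` helper, and the `in_vpc` label are assumptions for illustration and may not match the actual code in `models.py` or `lambda_extractor`.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace

# Edge types as listed in the PR description.
EDGE_TYPES = {
    "network", "iam", "triggers", "data_flow",
    "depends_on", "replicates_to", "encrypts", "logs_to",
}


@dataclass
class ResourceNode:
    id: str  # ARN or other unique identifier (assumed field name)
    service: str
    name: str
    region: str = ""
    properties: dict = field(default_factory=dict)


@dataclass
class ResourceEdge:
    source: str
    target: str
    edge_type: str
    label: str = ""

    def __post_init__(self):
        # Reject edge types outside the documented vocabulary.
        if self.edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {self.edge_type}")


@dataclass
class ConnectivityGraph:
    nodes: dict = field(default_factory=dict)  # keyed by node id to de-duplicate
    edges: list = field(default_factory=list)

    def add_node(self, node):
        self.nodes.setdefault(node.id, node)

    def add_edge(self, edge):
        self.edges.append(edge)

    def edges_of_type(self, edge_type):
        return [e for e in self.edges if e.edge_type == edge_type]


def extract_lambda(functions, graph):
    """Hypothetical extractor: in-memory function objects -> nodes and edges."""
    for fn in functions.values():
        graph.add_node(ResourceNode(fn.arn, "lambda", fn.name, fn.region))
        # getattr with a default means a missing attribute just skips the edge.
        vpc_id = getattr(fn, "vpc_id", None)
        if vpc_id:
            graph.add_node(ResourceNode(vpc_id, "vpc", vpc_id, fn.region))
            graph.add_edge(ResourceEdge(fn.arn, vpc_id, "network", "in_vpc"))
        kms_key = getattr(fn, "kms_key_arn", None)
        if kms_key:
            graph.add_node(ResourceNode(kms_key, "kms", kms_key, fn.region))
            graph.add_edge(ResourceEdge(kms_key, fn.arn, "encrypts"))


fn = SimpleNamespace(
    arn="arn:aws:lambda:us-east-1:123:function:test",
    name="test",
    region="us-east-1",
    vpc_id="vpc-abc",
    kms_key_arn=None,
)
graph = ConnectivityGraph()
extract_lambda({fn.arn: fn}, graph)
network_edges = graph.edges_of_type("network")
```

A downstream consumer interested only in network reachability would call something like `edges_of_type("network")` and ignore the rest, which is the filtering use case the typed edges are meant to support.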

Changes to existing files

| File | Change |
| --- | --- |
| prowler/config/config.py | Added "inventory-graph" to available_output_formats; added inventory_graph_file_suffix = ".inventory" |
| prowler/__main__.py | Lazy import + dispatch for mode == "inventory-graph" after the existing format handlers |
| prowler/CHANGELOG.md | New entry under ## [5.21.0] 🚀 Added |

Key design decisions

| Decision | Rationale |
| --- | --- |
| Read from sys.modules | Zero extra AWS API calls; services not scanned are silently skipped; output degrades gracefully when only a subset of checks ran |
| Self-contained HTML | D3.js v7 via CDN; no server, no build step; opens in any browser |
| One extractor file per service | Each extractor is independently testable; adding a new service = one new file + one line in the registry |
| Typed edges | Semantic types allow downstream consumers (attack-path tools, Neo4j import) to filter by relationship class |
| Lazy import in __main__.py | Module-level import avoided so the inventory module is not loaded on every Prowler run |
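
The "read from sys.modules" and "one line in the registry" decisions can be sketched roughly as below. This is an illustration under stated assumptions, not the code in `graph_builder.py`: the `REGISTRY` shape, `collect_loaded`, and the two toy extractors are hypothetical, and the S3 module path is assumed by analogy with the Lambda path used in the smoke test.

```python
import logging
import sys
from types import ModuleType, SimpleNamespace

logger = logging.getLogger(__name__)

LAMBDA_PATH = "prowler.providers.aws.services.awslambda.awslambda_client"
S3_PATH = "prowler.providers.aws.services.s3.s3_client"  # assumed path


def count_functions(client):
    # Toy extractor standing in for a real per-service extractor.
    return {"lambda_functions": len(client.functions)}


def count_buckets(client):
    return {"s3_buckets": len(client.buckets)}


# Hypothetical registry: module path -> (client attribute name, extractor).
# Supporting a new service would mean adding one entry here.
REGISTRY = {
    LAMBDA_PATH: ("awslambda_client", count_functions),
    S3_PATH: ("s3_client", count_buckets),
}


def collect_loaded(registry=REGISTRY, modules=None):
    """Run extractors only for clients that are already in memory."""
    modules = sys.modules if modules is None else modules
    results = {}
    for module_path, (attr, extractor) in registry.items():
        module = modules.get(module_path)  # dictionary lookup, never an import
        client = getattr(module, attr, None) if module else None
        if client is None:
            continue  # service was not scanned: skip silently
        try:
            results.update(extractor(client))
        except Exception as error:
            # One failing extractor should not abort the whole graph.
            logger.error("%s: %s", module_path, error)
    return results


# Demo with a stand-in mapping instead of the real sys.modules.
fake_module = ModuleType("fake_lambda_client_module")
fake_module.awslambda_client = SimpleNamespace(functions={"arn:1": object()})
summary = collect_loaded(modules={LAMBDA_PATH: fake_module})
```

Because the lookup never triggers an import, no service client is instantiated as a side effect, which is what keeps the AWS API call count at zero in this sketch.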

HTML graph features

  • Force-directed layout with drag-and-drop node pinning
  • Zoom / pan (mouse wheel + click-drag on background)
  • Per-service colour-coded nodes with a legend
  • Hover tooltips showing ARN + all metadata properties
  • Service filter dropdown (show only Lambda, EC2, RDS, etc.)
  • Adjustable link-distance and charge-strength physics sliders
  • Edge labels on every arrow
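
As a rough illustration of how graph data can be embedded into a single HTML file, the sketch below serialises a graph to JSON and substitutes it into a page template. The template, `render_html`, and the `nodes`/`links` key names are assumptions for illustration; the CDN URL is the public D3 v7 bundle and may differ from the one `write_html()` actually uses.

```python
import html
import json
from string import Template

# Minimal page skeleton; the real page would add the D3 force simulation,
# legend, filter dropdown, and sliders described above.
PAGE = Template("""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>$title</title></head>
<body>
<svg id="graph"></svg>
<script src="https://d3js.org/d3.v7.min.js"></script>
<script>
const graph = $graph_json;
// D3 code would consume graph.nodes and graph.links here.
</script>
</body>
</html>
""")


def render_html(graph, title="Inventory graph"):
    payload = json.dumps(graph)
    # A "</script>" inside a property value would close the tag early,
    # so escape the slash; "<\\/" is still valid inside a JSON/JS string.
    payload = payload.replace("</", "<\\/")
    return PAGE.substitute(title=html.escape(title), graph_json=payload)


page = render_html(
    {"nodes": [{"id": "a", "note": "</script>"}], "links": []}
)
```

Escaping `</` matters here because resource metadata such as tags or descriptions ends up inside the inline script and is not under the tool's control.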

Steps to review

  1. Code review: prowler/lib/outputs/inventory/ is self-contained; start with models.py, then graph_builder.py, then inventory_output.py, then any extractor of interest.

  2. Run locally against a real or mocked AWS account:

    # Run any subset of checks + generate graph output
    prowler aws --output-formats inventory-graph
    
    # Combined with other formats
    prowler aws --output-formats csv html inventory-graph

    Open output/<timestamp>.inventory.html in a browser.

  3. Smoke test (no AWS credentials needed):

    import sys
    from unittest.mock import MagicMock
    
    # Wire a fake Lambda client
    mock_module = MagicMock()
    mock_fn = MagicMock()
    mock_fn.arn = "arn:aws:lambda:us-east-1:123:function:test"
    mock_fn.name = "test"
    mock_fn.region = "us-east-1"
    mock_fn.vpc_id = "vpc-abc"
    mock_fn.security_groups = ["sg-111"]
    mock_fn.subnet_ids = {"subnet-aaa"}
    mock_fn.environment = None
    mock_fn.kms_key_arn = None
    mock_fn.layers = []
    mock_fn.dead_letter_config = None
    mock_fn.event_source_mappings = []
    mock_module.awslambda_client.functions = {mock_fn.arn: mock_fn}
    mock_module.awslambda_client.audited_account = "123"
    sys.modules["prowler.providers.aws.services.awslambda.awslambda_client"] = mock_module
    
    from prowler.lib.outputs.inventory.graph_builder import build_graph
    from prowler.lib.outputs.inventory.inventory_output import write_json, write_html
    
    graph = build_graph()
    write_json(graph, "/tmp/test.inventory.json")
    write_html(graph, "/tmp/test.inventory.html")
    # Open /tmp/test.inventory.html in a browser

  4. Verify zero impact on existing formats: running prowler aws --output-formats csv html still works identically; the new code path is only entered when mode == "inventory-graph".

  5. Verify CHANGELOG entry: prowler/CHANGELOG.md updated under ## [5.21.0] 🚀 Added.
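
The "only entered when mode == inventory-graph" behaviour in step 4 relies on a lazy import. A generic sketch of that pattern follows; `dispatch_output` and the handler table are hypothetical, and `json:dumps` is only a standard-library stand-in for the real `generate_inventory_outputs()` entry point so the example runs anywhere.

```python
import importlib


def dispatch_output(mode, handlers):
    """Return the handler for `mode`, importing its module only on a match.

    `handlers` maps an output mode to a "module.path:function" string, so
    modules for unselected formats are never imported.
    """
    target = handlers.get(mode)
    if target is None:
        return None
    module_name, func_name = target.split(":")
    module = importlib.import_module(module_name)  # import happens here
    return getattr(module, func_name)


# Stand-in table: in Prowler the target would be the inventory module.
HANDLERS = {"inventory-graph": "json:dumps"}

unselected = dispatch_output("csv", HANDLERS)
selected = dispatch_output("inventory-graph", HANDLERS)
rendered = selected({"a": 1})
```

With this shape, a run that requests only `csv html` never touches the inventory code, which is the property step 4 asks reviewers to confirm.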


Checklist

Community Checklist
  • This feature is listed in the Prowler attack-path / inventory roadmap (cartography integration direction)
  • No existing issue was assigned for this specific output format; opened as a new community contribution
  • Code is covered by a smoke test (end-to-end: mock client → graph builder → JSON + HTML output verified)
  • New module follows existing Prowler code style (logger usage, try/except with logger.error, no bare print)
  • CHANGELOG.md updated (prowler/CHANGELOG.md)
  • Backport — not required (new feature, not a bug fix)
  • README update — could add a line under output formats; happy to add if maintainers prefer

SDK/CLI

  • Are there new checks included in this PR? No — this PR adds a new output format only, no checks.
  • No new AWS IAM permissions required — the graph builder reads from already-loaded in-memory service clients.

License

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Introduces `--output-formats inventory-graph` which produces two files
after a scan completes:

  <output>.inventory.json  – machine-readable nodes + edges graph
  <output>.inventory.html  – interactive D3.js force-directed graph

Why
---
Prowler's existing outputs (CSV, ASFF, OCSF) report individual check
findings but provide no cross-service topology view.  Security engineers
need to understand _how_ resources are connected before they can reason
about attack paths, blast-radius, or lateral movement risk.  This output
fills that gap by building a connectivity graph from the service clients
that are already loaded during a scan.

What
----
• prowler/lib/outputs/inventory/models.py
  ResourceNode / ResourceEdge / ConnectivityGraph dataclasses.

• prowler/lib/outputs/inventory/graph_builder.py
  Reads already-loaded service clients from sys.modules (zero extra API
  calls) and delegates to per-service extractors.  Services not scanned
  are silently skipped.

• prowler/lib/outputs/inventory/extractors/
  lambda_extractor  – functions, VPC/subnet/SG edges, ESM triggers,
                      layers, DLQ, KMS
  ec2_extractor     – instances, security groups, subnet/VPC edges
  vpc_extractor     – VPCs, subnets, peering connections
  rds_extractor     – DB instances, VPC/SG/cluster/KMS edges
  elbv2_extractor   – ALB/NLB, SG and VPC edges
  s3_extractor      – buckets, replication, logging, KMS edges
  iam_extractor     – roles, trust-relationship edges

• prowler/lib/outputs/inventory/inventory_output.py
  write_json()  – serialises graph to JSON
  write_html()  – embeds graph data in a self-contained D3.js page with
                  force-directed layout, zoom/pan, tooltips, per-service
                  colour coding, service filter, and physics controls.

• prowler/config/config.py
  Added "inventory-graph" to available_output_formats and
  inventory_graph_file_suffix = ".inventory".

• prowler/__main__.py
  Lazy import + call to generate_inventory_outputs() when mode ==
  "inventory-graph".

How it works
------------
1. Run Prowler as normal with any set of checks.
2. Add `--output-formats inventory-graph` (combinable with csv/html/etc.).
3. After checks finish the graph builder walks sys.modules looking for
   service clients that were loaded during the scan.
4. Each extractor turns the in-memory service objects into ResourceNode
   and ResourceEdge objects (no extra AWS API calls).
5. JSON + HTML files are written alongside other output files.

Usage
-----
  prowler aws --output-formats inventory-graph

The HTML file opens in any browser with no server needed.

sandiyochristan requested review from a team as code owners March 19, 2026 00:06
github-actions bot added the community label (Opened by the Community) Mar 19, 2026

github-actions bot (Contributor) commented Mar 19, 2026

Conflict Markers Resolved

All conflict markers have been successfully resolved in this pull request.

encryption = getattr(bucket, "encryption", None)
versioning = getattr(bucket, "versioning_enabled", None)
logging = getattr(bucket, "logging", None)
public = getattr(bucket, "public_access_block", None)

Check notice
Code scanning / CodeQL: Unused local variable (Note)
Variable public is not used.

import os
from dataclasses import asdict
from datetime import datetime
from typing import Optional

Check notice
Code scanning / CodeQL: Unused import (Note)
Import of 'Optional' is not used.

codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 0.68729% with 289 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.51%. Comparing base (5a3475b) to head (4dcadab).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10382      +/-   ##
==========================================
+ Coverage   56.85%   63.51%   +6.66%     
==========================================
  Files          87      105      +18     
  Lines        2846     7022    +4176     
==========================================
+ Hits         1618     4460    +2842     
- Misses       1228     2562    +1334     
| Flag | Coverage Δ |
| --- | --- |
| prowler-py3.10-config | 63.51% <0.68%> (?) |
| prowler-py3.10-lib | 63.22% <0.00%> (?) |
| prowler-py3.10-oraclecloud | ? |
| prowler-py3.11-config | 63.51% <0.68%> (?) |
| prowler-py3.11-lib | 63.22% <0.00%> (?) |
| prowler-py3.11-oraclecloud | ? |
| prowler-py3.12-config | 63.51% <0.68%> (?) |
| prowler-py3.12-lib | 63.22% <0.00%> (?) |
| prowler-py3.12-oraclecloud | ? |
| prowler-py3.9-config | 63.51% <0.68%> (?) |
| prowler-py3.9-lib | 63.22% <0.00%> (?) |
| prowler-py3.9-oraclecloud | ? |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
| --- | --- |
| prowler | 63.51% <0.68%> (+6.66%) ⬆️ |
| api | ∅ <ø> (∅) |

jfagoagas added the not-planned label (Issues that are not in the Prowler roadmap.) Mar 19, 2026

jfagoagas (Member) commented Mar 19, 2026

Thanks for this contribution @sandiyochristan. For now I've left the not-planned label, as this specific kind of inventory is not something on our roadmap. We'll evaluate it and get back to you.

One thing I'd like to highlight is that in the Community Checklist we have the following two checks:

  • This feature/issue is listed in here or roadmap.prowler.com
  • Is it assigned to me, if not, request it via the issue/feature in here or Prowler Community Slack

It's important to leave them as they are because we want to sync with community contributors, starting from an issue, before any implementation work begins. This prevents duplicated work and contributions that are not planned or not considered on our end, and it gives us a chance to discuss and define the implementation or fix first. I hope you understand. We always want to give the best support to our community.

Thanks!
