-
Notifications
You must be signed in to change notification settings - Fork 49
Powerpipe Export Timeout with Large JSONB Fields #950
Description
Powerpipe Export Timeout with Large JSONB Fields
Describe the bug
Powerpipe control exports (--export=filename.json) timeout when processing controls with large JSONB fields containing complex nested data, while JSON console output (--output=json) works correctly. The issue appears to be in the JSON exporter's handling of large JSON/JSONB fields during marshaling, causing performance degradation and eventual timeouts.
Powerpipe version
Powerpipe v1.4.0
Powerpipe v0.0.0-dev-develop.20251002184514 (built from commit ea87a9e)
Steampipe v1.1.4
To reproduce
-
Setup: Install Powerpipe and any Steampipe plugin with tables containing large JSONB fields (e.g., cloud provider policies with complex assignment rules)
-
Repro steps:
# This times out (exit code 124) and creates no file timeout 30s powerpipe control run your_control_name \ --mod-location "/path/to/mod" \ --export test_export.json \ --progress=false # This works correctly (exit code 1, normal for controls with alarms) powerpipe control run your_control_name \ --mod-location "/path/to/mod" \ --output=json > test_output.json
-
Control SQL example (generic pattern):
control example_control_with_large_jsonb { title = "Example Control with Large JSONB Fields" sql = <<EOT select resource_id, name, large_jsonb_field, -- Contains complex nested JSON arrays/objects case when large_jsonb_field is null or large_jsonb_field = '[]'::jsonb then 'none' else 'complex-data' end as data_summary from some_plugin_table where large_jsonb_field is not null EOT }
Expected behavior
Both --export=filename.json and --output=json should work correctly with large JSONB fields, returning exit codes indicating success or control results (0, 1, or 2).
Actual behavior
--export=filename.json: Times out after 30+ seconds (exit code 124), creates no output file--output=json: Works correctly, produces substantial JSON output, proper exit code
Failed Workarounds Attempted
1. JSONB Field Truncation
Attempted: Truncating large JSONB fields to prevent marshaling issues
case
when length(large_field::text) > 1000 then
left(large_field::text, 1000) || '...'
else large_field::text
end as truncated_fieldResult: Still times out - truncation doesn't prevent the underlying marshaling issue
2. Length-Based Conditional Logic
Attempted: Using jsonb_array_length() with size checks
case
when length(assignments::text) < 10000 and jsonb_array_length(assignments) = 1 then
assignments -> 0 ->> 'field'
when length(assignments::text) < 10000 then
'multiple (' || jsonb_array_length(assignments) || ' items)'
else 'large-data'
end as smart_summaryResult: Still times out - JSON parsing functions themselves cause performance issues
3. Simplified Assignment Summaries
Attempted: Replacing complex nested field parsing with simple existence checks
case
when large_field is null or large_field = '[]'::jsonb then 'none'
else 'assigned'
end as simple_summaryResult: Reduced but did not eliminate timeouts with very large fields
Root Cause Analysis
Based on source code investigation:
-
Different Code Paths:
--exportusespipe-fittingsJSON exporter →querydisplay.BuildJSON()→ParseJSONOutputColumnValue()--output=jsonuses direct JSON streaming/output handling
-
Problematic Function:
pipe-fittings/querydisplay/BuildJSON()attempts to marshal large JSONB fields entirely in memory, causing timeouts with complex assignment data -
JSONB Field Content: Fields contain arrays with complex nested objects:
[ { "target": { "@odata.type": "#microsoft.graph.someAssignmentTarget", "filterId": null, "filterType": "none" }, "source": "direct", "sourceId": null } ]
Additional context
- Environment: Reproduced in WSL/Linux environment
- Custom Build: Built Powerpipe from source (latest develop branch) - issue persists
- Workaround Currently Used: Using
--output=jsonand redirecting to file works correctly - Impact: Affects controls with any large JSONB fields containing complex nested data
- Related Files:
pipe-fittings/querydisplay/display.go-BuildJSON()functionpipe-fittings/querydisplay/column.go-ParseJSONOutputColumnValue()function