SS-90: Add OTel telemetry instrumentation to SmartFix#139
Merged
JacobMagesHaskinsContrast merged 21 commits intorelease_candidatefrom Apr 13, 2026
Merged
SS-90: Add OTel telemetry instrumentation to SmartFix#139JacobMagesHaskinsContrast merged 21 commits intorelease_candidatefrom
JacobMagesHaskinsContrast merged 21 commits intorelease_candidatefrom
Conversation
Add spec section references, async context propagation notes, force_flush clarification, turn-based LLM span cardinality, and e2e testing bead (contrast-eo1). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use standard OTEL_EXPORTER_OTLP_ENDPOINT env vars (no CONTRAST_HOST derivation) - Remove _NoOpSpan helper — use SDK built-in NoOpTracerProvider instead - Move files_modified to operation span (per-PR, not per-run) - Reuse TokenCostAccumulator/_log_cost_analysis() for token values in LLM spans Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add opentelemetry-sdk==1.37.0 and opentelemetry-exporter-otlp-proto-http==1.37.0 to requirements.txt (versions match google-adk transitive constraint >=1.36.0) - Regenerate requirements.lock - Add GITHUB_RUN_ID optional config field (defaults to ""), used as session.id on the OTel tool-run span Bead ID: contrast-37r Tests: passing (773 tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New src/otel_provider.py with: - initialize_otel(config): sets up TracerProvider when OTEL_EXPORTER_OTLP_ENDPOINT is configured; no-op otherwise - Resource attributes: service.name/version, contrast.org_id, vcs.* - OTLPSpanExporter() with no args (reads env vars automatically) - start_span(name): thin wrapper around trace.get_tracer().start_as_current_span() - shutdown_otel(): force_flush(2000ms) then shutdown, double-call safe New test/test_otel_provider.py: 11 tests using OTel InMemorySpanExporter and mocks covering all key behaviours. Bead ID: contrast-22m Tests: passing (784 tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Split main() into thin wrapper + _main_impl() to avoid re-indenting the 530-line body - main() calls initialize_otel(config), registers atexit shutdown, wraps _main_impl() in 'smartfix-run' span - session.id = config.GITHUB_RUN_ID set at span start - contrast.smartfix.vulnerabilities_total set in finally (0 when no vulns processed; incremented past the existing-PR skip check) - shutdown_otel() called in both finally and atexit (double-call safe) - Added test_main_initializes_and_shuts_down_otel to test_main.py Bead ID: contrast-3ta Tests: passing (785 tests) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Wrap each vulnerability iteration in a 'fix-vulnerability' child span (child of smartfix-run), active during LiteLLM calls for context propagation - Set request-side attributes immediately: contrast.finding.fingerprint, contrast.finding.source, contrast.finding.rule_id, contrast.smartfix.coding_agent, contrast.smartfix.llm_provider - Set response-side attributes in finally: contrast.finding.language, contrast.smartfix.fix_applied, contrast.smartfix.files_modified, contrast.smartfix.pr_created, contrast.smartfix.pr_url - Add GitOperations.get_staged_files_count() using git diff --cached - Use except BaseException to catch SystemExit from error_exit() Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add _derive_system() to map model strings to gen_ai.system values
- Refactor _log_cost_analysis() to return (input, output, cache_read, cache_write) 4-tuple
- Wrap each _call_llm_with_retry() attempt in a 'chat {model}' OTel span
- Set gen_ai.* request/response/usage attributes on each LLM span
- Record exception info and StatusCode.ERROR on retryable/terminal failures
- Move _log_cost_analysis() call from generate_content_async() into _call_llm_with_retry()
- Add 15 new tests (TestDeriveSystem, TestLogCostAnalysisReturnValue, TestCallLlmWithRetryOtelSpan)
- Fix retry tests: mock _log_cost_analysis to return 4-tuple
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add test_otel_spans.py: 25 integration tests using InMemorySpanExporter
- TestSmartFixRunSpan: session.id, vulnerabilities_total present; files_modified/pr_created/pr_url absent
- TestFixVulnerabilitySpan: all contrast.finding.* and contrast.smartfix.* attributes
- TestLlmCallSpan: gen_ai.* attributes, 'chat {model}' name format, parent-child span nesting
- Patches src.otel_provider.start_span to bypass OTel SDK's Once global-state restriction
- Fix test_smartfix_litellm_retry.py: mock _log_cost_analysis to return (0,0,0,0) 4-tuple
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ontrast-eo1) Adds grafana/otel-lgtm stack for manually verifying SmartFix span output without the full Contrast API. Includes instructions for running against it and checklist of span attributes/hierarchy to verify in Tempo. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- tests/otel/run_otel_test.sh: orchestrates the full test - auto-detects Docker socket (Docker Desktop or Rancher Desktop) - starts grafana/otel-lgtm stack, waits for Grafana + Tempo health checks - runs SmartFix via act with OTEL_EXPORTER_OTLP_ENDPOINT injected (host.docker.internal:4318) - runs verify_spans.py after the run with --verify-only flag for re-checks - tests/otel/verify_spans.py: standalone span verifier - queries Tempo HTTP API for most recent smartfix-run trace - asserts root span has session.id, vulnerabilities_total; lacks files_modified/pr_created/pr_url - asserts fix-vulnerability spans have all contrast.finding.* and contrast.smartfix.* attributes - asserts chat spans are children of fix-vulnerability with all gen_ai.* attributes - supports --wait for polling until spans appear, --trace-id for specific traces - docker-compose.otel.yml: expose port 3200 (Tempo HTTP API) for verify_spans.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The script no longer knows about any consuming repo. It only manages the LGTM Docker stack lifecycle (--start-only, --verify-only, --stop). The act invocation lives in the consuming test repo's own Makefile, keeping the dependency arrow pointing the right way. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a confirmation log so the act run log clearly shows whether OTel was initialized and which endpoint it is exporting to. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
shutdown_otel() was called inside the `with start_span("smartfix-run")`
block, so force_flush() ran while the root span was still open. The
BatchSpanProcessor exported all completed child spans but the root span
had not ended yet. When the with-block exited and ended the root span,
the processor was already shut down and dropped it.
Fix: wrap the with-block in a try/finally so shutdown_otel() is called
after the root span has ended and been submitted to the batch processor.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SmartFixLiteLlm is instantiated while fix-vulnerability is the current OTel span. Capture that context in __init__ and pass it explicitly when creating each 'chat <model>' span so they are always direct children of fix-vulnerability, regardless of whatever ADK-internal spans (call_llm, invocation, agent_run) may be active at LLM-call time. - otel_provider.start_span() gains an optional context parameter - SmartFixLiteLlm.__init__ captures otel_context.get_current() - _call_llm_with_retry passes self._otel_context to start_span - Tests updated: mock_start_span lambdas accept context=None, and MagicMock(spec=) instances set _otel_context = None Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Collaborator
Author
Contributor
There was a problem hiding this comment.
Pull request overview
Adds OpenTelemetry (OTel) tracing to SmartFix (opt-in via standard OTEL env vars) and introduces test + local E2E verification infrastructure to ensure span hierarchy/attributes match the SS-90 observability spec.
Changes:
- Introduces
src/otel_provider.pyto manage OTel initialization, span creation, and shutdown/flush behavior. - Instruments
src/main.pyandsrc/smartfix/extensions/smartfix_litellm.pyto emitsmartfix-run,fix-vulnerability, andchat {model}spans with required attributes. - Adds extensive unit/integration tests plus local verification tooling (
docker-compose.otel.yml,tests/otel/*) to validate spans without external dependencies.
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/otel/verify_spans.py | Tempo-querying verifier script to assert span structure/attributes against the spec. |
| tests/otel/run_otel_test.sh | Helper script to run the local Grafana/Tempo stack and execute span verification. |
| test/test_smartfix_litellm.py | Adds tests for _derive_system, _log_cost_analysis return values, and per-attempt chat span creation. |
| test/test_smartfix_litellm_retry.py | Adjusts retry tests to match new _log_cost_analysis contract and OTel context usage. |
| test/test_otel_spans.py | In-memory exporter integration tests validating span names/attributes/parent-child relationships. |
| test/test_otel_provider.py | Unit tests for provider lifecycle and “no-op when disabled” behavior. |
| test/test_main.py | Tests for root/operation span creation and required request/response attributes. |
| test/test_git_operations.py | Adds coverage for counting staged files. |
| test/test_config_integration.py | Adds coverage for GITHUB_RUN_ID defaulting and env reading. |
| src/smartfix/extensions/smartfix_litellm.py | Emits chat {model} spans per attempt; refactors cost parsing to return token counts for span attributes. |
| src/smartfix/domains/scm/git_operations.py | Adds get_staged_files_count() used for operation-span attributes. |
| src/requirements.txt | Adds OTel SDK + OTLP HTTP exporter deps. |
| src/requirements.lock | Regenerates lock metadata for new direct requirements. |
| src/otel_provider.py | New OTel provider module (init/start_span/shutdown). |
| src/main.py | Wraps execution in smartfix-run root span and creates fix-vulnerability spans per vulnerability with attributes. |
| src/config.py | Adds optional GITHUB_RUN_ID config field (defaults to empty string). |
| docker-compose.otel.yml | Local Grafana/otel-lgtm stack for manual E2E verification. |
| .beads/issues.jsonl | Records SS-90 task tracking updates. |
Alex-Contrast
approved these changes
Apr 10, 2026
…x/domains/telemetry/ and update imports
The script runs with set -euo pipefail, so a non-zero exit from verify_spans.py would terminate immediately before VERIFY_EXIT=$? or the success/failure messaging could run. Use if/then/else to capture the exit code without triggering set -e.
Copilot flagged that trace.set_tracer_provider() is a one-time global and a pre-existing provider could cause spans to be lost. Add a design note explaining why this cannot happen in SmartFix: ADK's OTel setup only runs from its CLI web server (not the Runner API we use), and LiteLLM's OTel integration requires explicit callback configuration.
vuln_count was a local int so if _main_impl() exited via error_exit()/sys.exit() (BUILD_VERIFICATION_FAILED, GENERATE_PR_FAILURE) after incrementing the counter, the finally block would report 0 instead of the actual number processed. Use a mutable list so the finally block always reads the correct value regardless of how _main_impl exits.
Collaborator
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.




Summary
Adds OpenTelemetry (OTel) tracing to SmartFix per the AI Observability spec. Emits structured spans to any OTLP-compatible backend (disabled by default when
OTEL_EXPORTER_OTLP_ENDPOINTis unset).Span Hierarchy
Changes
New infrastructure
src/otel_provider.py— TracerProvider lifecycle: initialize,start_span(), shutdown withforce_flushdocker-compose.otel.yml— grafana/otel-lgtm stack for local E2E verificationInstrumentation
src/main.py—smartfix-runroot span (session.id,contrast.smartfix.vulnerabilities_total);fix-vulnerabilitychild span per vulnerability with allcontrast.finding.*andcontrast.smartfix.*attributes (fix_applied,files_modified,pr_created,pr_url)src/smartfix/extensions/smartfix_litellm.py—chat {model}span per LLM attempt with fullgen_ai.*attributes;_derive_system()maps model strings togen_ai.systemvalues;_log_cost_analysis()now returns token counts for span populationConfig
src/config.py/action.yml— no new required inputs; OTel is opt-in via standardOTEL_EXPORTER_OTLP_ENDPOINTenv varTests
test/test_otel_provider.py— TracerProvider lifecycle unit teststest/test_otel_spans.py— 25 integration tests usingInMemorySpanExporterverifying span names, attributes, and parent-child relationshipstest/test_smartfix_litellm.py—TestDeriveSystem,TestLogCostAnalysisReturnValue,TestCallLlmWithRetryOtelSpan(15 new tests)test/test_main.py— operation span request/response attribute teststest/test_git_operations.py—get_staged_files_count()testsTest plan
docker compose -f docker-compose.otel.yml up -d, setOTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318, run against a real repo, verify spans in Grafana Tempo at http://localhost:3000Jira
SS-90
🤖 Generated with Claude Code