feat: Send GenAI spans as V2 envelope items #6079

9 issues

code-review: Found 9 issues (2 high, 7 medium)

High

Typo 'spans' instead of 'span' causes test to capture no span items - `tests/integrations/openai_agents/test_openai_agents.py:528`

On line 528, capture_items("transaction", "spans") uses the incorrect item type "spans" instead of "span". The capture_items fixture filters items by item.type, which is "span" (singular). As a result, no span items are captured, and the filter item.type == "span" on line 537 returns an empty list. The subsequent next() call on lines 538-540 then raises StopIteration, causing the test to fail.

Also found at:

tests/integrations/openai_agents/test_openai_agents.py:1731
tests/integrations/openai_agents/test_openai_agents.py:1796
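
The failure mode can be sketched as follows. This is a minimal illustration, not the SDK's test code: the `Item` class and the `capture_items` function are hypothetical stand-ins that only mimic filtering by `item.type`.

```python
from dataclasses import dataclass


@dataclass
class Item:
    type: str
    payload: dict


def capture_items(all_items, *types):
    # Hypothetical stand-in for the fixture: keep only the requested item types.
    return [item for item in all_items if item.type in types]


sent = [Item("transaction", {"name": "run"}), Item("span", {"name": "chat"})]

wrong = capture_items(sent, "transaction", "spans")  # plural matches no item
right = capture_items(sent, "transaction", "span")

spans_wrong = [item for item in wrong if item.type == "span"]
spans_right = [item for item in right if item.type == "span"]

# next() on a generator over an empty list raises StopIteration.
try:
    next(span for span in spans_wrong if span.payload["name"] == "chat")
    raised = False
except StopIteration:
    raised = True
```

Passing a default to next(), or asserting that the span list is non-empty first, would turn the bare StopIteration into a readable failure.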

Test accesses wrong span format - transaction spans have 'op' not 'attributes.sentry.op' - `tests/integrations/pydantic_ai/test_pydantic_ai.py:830-838`

In test_message_history, spans are extracted from second_transaction["spans"] (line 830) but then filtered using s["attributes"].get("sentry.op", "") (line 832). Transaction-embedded spans use the legacy format with s["op"] and s["data"], not s["attributes"]. Because these spans have no attributes key, the subscript s["attributes"] raises KeyError; if the key were present but lacked sentry.op, the filter would match nothing and the assertions would pass vacuously.

Also found at:

tests/tracing/test_misc.py:628
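
The difference between the two span shapes can be sketched as follows. The dictionaries are illustrative values, and `span_op` is a hypothetical helper, not part of the SDK or its tests.

```python
# Illustrative shapes only; real spans carry many more fields.
legacy_span = {"op": "gen_ai.chat", "data": {"model": "example-model"}}
v2_span = {"name": "chat", "attributes": {"sentry.op": "gen_ai.chat"}}


def span_op(span):
    """Hypothetical helper: return the op of a span in either format."""
    if "attributes" in span:
        return span["attributes"].get("sentry.op", "")
    return span.get("op", "")


# The filter from the finding, applied to a legacy span, fails on the subscript.
try:
    legacy_span["attributes"].get("sentry.op", "")
    outcome = "no error"
except KeyError:
    outcome = "KeyError"

chat_spans = [s for s in (legacy_span, v2_span) if span_op(s) == "gen_ai.chat"]
```

In the test itself, reading s["op"] directly should be enough, since second_transaction["spans"] holds only legacy spans.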

Medium

Wrong event variable passed to span conversion - uses original event instead of prepared event - `sentry_sdk/client.py:1134`

On line 1134, event (the original function parameter) is passed to _serialized_v1_span_to_serialized_v2_span() instead of event_opt (the prepared/processed event). The _prepare_event() function populates release, environment, and sdk fields from options (lines 805-811 in client.py), and applies scope data. Since _serialized_v1_span_to_serialized_v2_span() extracts these values to populate span attributes (like sentry.release, sentry.environment, sentry.sdk.name), using the original event will result in missing or incomplete attributes on the converted GenAI spans.
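
The effect of passing the unprepared event can be sketched as follows. Both functions are hypothetical stand-ins that only model the behavior described above; they are not the implementations of _prepare_event() or _serialized_v1_span_to_serialized_v2_span(), and the option values are made up.

```python
def prepare_event(event, options):
    """Hypothetical stand-in: fill unset fields on a copy of the event from options."""
    prepared = dict(event)
    for key in ("release", "environment"):
        if prepared.get(key) is None and options.get(key) is not None:
            prepared[key] = options[key]
    return prepared


def span_attributes(event):
    """Hypothetical stand-in: derive span attributes from event-level fields."""
    attributes = {}
    if event.get("release") is not None:
        attributes["sentry.release"] = event["release"]
    if event.get("environment") is not None:
        attributes["sentry.environment"] = event["environment"]
    return attributes


options = {"release": "my-app@1.0.0", "environment": "production"}
event = {"type": "transaction", "spans": []}
event_opt = prepare_event(event, options)

from_original = span_attributes(event)      # nothing to extract
from_prepared = span_attributes(event_opt)  # release and environment present
```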

Sort key uses 'name' twice instead of 'name' and 'description' - `tests/integrations/google_genai/test_google_genai.py:330`

The sorting lambda uses t.get("name", "") twice as the sort key tuple, but the comment says "sort by name and description for comparison". This appears to be a copy-paste error during refactoring. The second key should be t.get("description", "") to match the stated intent and ensure deterministic ordering when multiple tools have the same name.
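
A small sketch of why the duplicated key matters, using made-up tool dictionaries. Because Python's sort is stable, sorting by name alone leaves same-named tools in input order, so the result depends on how the tools happened to arrive.

```python
tools = [
    {"name": "search", "description": "web"},
    {"name": "search", "description": "docs"},
    {"name": "calc", "description": "math"},
]


def by_name_twice(t):
    # The key as written in the test: the second element adds no information.
    return (t.get("name", ""), t.get("name", ""))


def by_name_and_description(t):
    # The key the comment describes.
    return (t.get("name", ""), t.get("description", ""))


shuffled = list(reversed(tools))

# Same tools, different input order: only the second key gives a stable result.
twice_a = sorted(tools, key=by_name_twice)
twice_b = sorted(shuffled, key=by_name_twice)
full_a = sorted(tools, key=by_name_and_description)
full_b = sorted(shuffled, key=by_name_and_description)
```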

Test uses incorrect key 'attributes' instead of 'data' for inline_data - `tests/integrations/google_genai/test_google_genai.py:2153`

The test was changed to use attributes as the key for binary data in inline_data, but the Google GenAI SDK uses data. The transform_google_content_part function (sentry_sdk/ai/utils.py:286) accesses inline_data.get("data", ""), so the test now passes only because the code overwrites result["content"] with BLOB_DATA_SUBSTITUTE regardless of input. The test therefore no longer validates correct handling of real Google GenAI inline_data dictionaries.

Also found at:

tests/integrations/pydantic_ai/test_pydantic_ai.py:490-496

Hardcoded SDK version will cause test failures on version bumps - `tests/integrations/huggingface_hub/test_huggingface_hub.py:523`

The test hardcodes "sentry.sdk.version": "2.58.0" instead of using mock.ANY like all other similar tests in this file and other test files. This will cause the test to fail when the SDK version is incremented, making this test brittle and requiring manual updates with each release.
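
A minimal sketch of the difference, using unittest.mock.ANY from the standard library. The attribute dictionaries and the bumped version string are illustrative values, not taken from the test.

```python
from unittest import mock

# Attributes as they might look after a version bump (illustrative values).
actual = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": "2.59.0"}

hardcoded = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": "2.58.0"}
tolerant = {"sentry.sdk.name": "sentry.python", "sentry.sdk.version": mock.ANY}

hardcoded_matches = actual == hardcoded  # breaks as soon as the version changes
tolerant_matches = actual == tolerant    # mock.ANY compares equal to any value
```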

Unused list comprehension results in dead code and no test assertions - `tests/integrations/langchain/test_langchain.py:1840-1844`

The list comprehension at lines 1840-1844 creates a list that is never assigned to a variable or used in any assertion. As a result, test_langchain_embeddings_error_handling only verifies that the ValueError is raised and makes no assertions about the captured data. Additionally, the capture_items call at line 1821 only captures 'transaction' and 'span' types, but the comprehension filters for item.type == 'event', which would never match anyway.
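
The dead expression can be illustrated with a short sketch. The captured items here are made-up dictionaries rather than real envelope items.

```python
# Stand-in for what capture_items("transaction", "span") would hold.
captured = [{"type": "transaction"}, {"type": "span"}]

# Bare expression statement: the list is built and immediately discarded.
[item for item in captured if item["type"] == "event"]

# Binding the result makes it checkable, and shows this filter cannot match
# because no "event" items were captured in the first place.
events = [item for item in captured if item["type"] == "event"]
spans = [item for item in captured if item["type"] == "span"]
```

What the test should assert about the captured data is a decision for the author; the sketch only shows that the current statement checks nothing.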

Test assertions silently skipped due to missing 'span' in capture_items types - `tests/integrations/litellm/test_litellm.py:945`

At line 945, capture_items("transaction") only captures transaction items, but later assertions (lines 1020-1023, outside the hunk) iterate over items filtering for item.type == "span". Since spans aren't captured, the spans list will be empty and the for-loop never executes, causing the test to silently pass without verifying any span attributes.

Also found at:

tests/integrations/litellm/test_litellm.py:1020-1023
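
The silent pass can be sketched as follows. `check_span_attributes` is a hypothetical helper and the items are made-up dictionaries; the point is that a loop over an empty list runs no assertions unless a guard checks the list first.

```python
def check_span_attributes(items, require_spans):
    """Hypothetical helper: assert on every span item, return how many were checked."""
    spans = [item for item in items if item["type"] == "span"]
    if require_spans:
        assert spans, "no span items were captured"
    checked = 0
    for span in spans:
        assert "sentry.op" in span["attributes"]
        checked += 1
    return checked


# What capture_items("transaction") alone would yield: no span items at all.
only_transactions = [{"type": "transaction"}]

# Without the guard, the check "passes" while verifying zero spans.
silently_checked = check_span_attributes(only_transactions, require_spans=False)

try:
    check_span_attributes(only_transactions, require_spans=True)
    guard_fired = False
except AssertionError:
    guard_fired = True
```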

Removed assertion weakens test coverage for concurrent transaction capture - `tests/integrations/openai_agents/test_openai_agents.py:2275`

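The original test test_multiple_agents_asyncio had an explicit assert len(events) == 3 to verify that exactly 3 transactions were captured. This assertion was removed during refactoring. If fewer transactions are captured, unpacking fails with a ValueError rather than an assertion failure, which obscures the cause. If more are captured, plain tuple unpacking also raises ValueError; extras are silently ignored only when the items are consumed some other way, for example through next() or slicing. Either way, the explicit length check states the intent more clearly and should be restored.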
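
A short sketch of how unpacking behaves when the count is off, using a hypothetical generator in place of the captured transactions. Note that with plain tuple unpacking a surplus raises ValueError just as a shortfall does; extras are dropped silently only when items are pulled with next() or slicing.

```python
def fake_transactions(count):
    # Hypothetical stand-in for the captured transaction events.
    return ({"type": "transaction", "id": i} for i in range(count))


def unpack_three(events):
    first, second, third = events
    return first, second, third


def outcome(count):
    try:
        unpack_three(fake_transactions(count))
        return "ok"
    except ValueError as exc:
        return str(exc)


too_few = outcome(2)   # ValueError: not enough values to unpack
exact = outcome(3)     # unpacks cleanly
too_many = outcome(4)  # ValueError: too many values to unpack

# An explicit length check reports the mismatch as an assertion failure.
events = list(fake_transactions(3))
assert len(events) == 3
```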

Duration: 39m 20s · Tokens: 14.5M in / 179.4k out · Cost: $20.80 (+extraction: $0.01, +merge: $0.01, +fix_gate: $0.03)
