Skip to content

Fix Python from_dict() round-trip for optional fields with schema defaults#1313

Open
stephentoub wants to merge 2 commits into
mainfrom
stephentoub/jubilant-meme
Open

Fix Python from_dict() round-trip for optional fields with schema defaults#1313
stephentoub wants to merge 2 commits into
mainfrom
stephentoub/jubilant-meme

Conversation

@stephentoub
Copy link
Copy Markdown
Collaborator

Fixes #1139.
Fixes #1140.
Fixes #1141.

Why

from_dict(to_dict(x)) was not the identity for three generated Python dataclasses whenever the affected optional field was None. The dataclass declares the field as T | None = None and to_dict() omits it when None, but from_dict() was substituting the JSON-Schema default value when the key was missing, silently mutating unset fields:

  • SessionTaskCompleteData.summary: None""
  • PermissionPromptRequest.action: NonePermissionPromptRequestMemoryAction.STORE
  • PermissionRequest.action: NonePermissionRequestMemoryAction.STORE

Any code that round-trips events (caching, replay, audit logs, dedup) or pattern-matches on None to mean "absent" was getting the wrong answer end-to-end. The bug was reproducible on main as of today.

What changed

Root cause is in scripts/codegen/python.ts: for optional fields with a schema default, the codegen was emitting obj.get("key", "<default>"), then passing that into from_union([from_none, parse_X], ...), which happily parsed the default and returned it. The dataclass default and the wire default were inconsistent, so the round-trip broke.

  • Dropped the defaultLiteral mechanism from both emitPyClass and emitPyFlatDiscriminatedUnion. from_dict() now uniformly emits obj.get(key), so missing/null keys flow through from_none and land at the dataclass-declared None. Removed the now-unused toPythonLiteral helper.
  • Regenerated python/copilot/generated/session_events.py via npm run generate:python. Exactly three call sites changed, all matching the bug reports.
  • Flipped the two codegen-level assertions in nodejs/test/python-codegen.test.ts that previously locked in the buggy output, and added negative assertions to prevent regression at the generator level.
  • Replaced test_schema_defaults_are_applied_for_missing_optional_fields in python/test_event_forward_compatibility.py (which asserted the bug as expected behavior) with two regression tests covering missing-key parsing, explicit-null parsing, and the full from_dict(to_dict(x)) round-trip for all three affected classes.

Surface area / non-obvious notes

  • Other languages (Go, .NET, Rust, TypeScript) are not affected. Their generators never read propSchema.default to construct deserialization fallbacks; they use real nullable types end-to-end.
  • Grep on the post-fix generated session_events.py and rpc.py confirms zero remaining obj.get("...", "...") patterns, so no other class was silently relying on baked-in schema defaults.
  • The test_schema_defaults_are_applied_for_missing_optional_fields test was the only consumer in the repo of the old behavior. Nothing in runtime code depended on request.action == STORE or task_complete.summary == "" for unset fields.
  • JSON Schema default is an annotation, not validation behavior, so applying it only during deserialization while ignoring it in the dataclass field and in to_dict() was internally inconsistent. This change makes Python deserialization symmetric with serialization and with the in-memory default.

Validation

  • python -m pytest test_event_forward_compatibility.py -> 10/10 pass (including new regression tests).
  • Broader Python suite: 136/136 non-E2E tests pass.
  • npx vitest run on python-codegen.test.ts and other non-CLI test files: pass.
  • ruff check, ruff format --check, ty check copilot, tsc --noEmit, eslint: clean.
  • Manual reproduction of each issue's repro script on unmodified main (changes stashed) confirms the bug; with the fix applied, all three round-trip with equal? True.

Independent review by the code-review agent on claude-opus-4.7-xhigh found no significant issues and confirmed both codegen paths and the test coverage.

…aults

Fixes #1139, #1140, #1141.

The Python codegen was embedding JSON-Schema `default` values into
`obj.get(key, default)` for optional fields. But the generated
dataclass field is always `T | None = None` and `to_dict()` omits
the field when `None`, so `from_dict(to_dict(x))` silently mutated
unset fields into the schema default:

- `SessionTaskCompleteData.summary`: `None` -> `""`
- `PermissionPromptRequest.action`: `None` -> `MemoryAction.STORE`
- `PermissionRequest.action`: `None` -> `MemoryAction.STORE`

Drop `defaultLiteral` from both `emitPyClass` and
`emitPyFlatDiscriminatedUnion` so `from_dict()` always uses
`obj.get(key)` (matching the dataclass default). Regenerate
`session_events.py`. Flip the two codegen-level test assertions that
previously locked in the buggy output and add negative assertions.
Replace `test_schema_defaults_are_applied_for_missing_optional_fields`
(which asserted the bug as expected behavior) with regression tests
covering missing-key parsing, explicit-null parsing, and full
`from_dict(to_dict(x))` round-trips for all three affected classes.

Other languages (Go, .NET, Rust, TypeScript) are unaffected; their
generators never read `propSchema.default` for deserialization
fallbacks.

Co-authored-by: Copilot <[email protected]>
Copilot AI review requested due to automatic review settings May 16, 2026 21:57
@stephentoub stephentoub requested a review from a team as a code owner May 16, 2026 21:57
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Python generated event-model deserialization so that missing optional fields no longer silently substitute JSON-Schema default values, restoring from_dict(to_dict(x)) identity when those fields are None (addressing #1139, #1140, #1141).

Changes:

  • Updated Python codegen to stop emitting obj.get(key, <schema default>) fallbacks during from_dict() generation and always use obj.get(key).
  • Regenerated python/copilot/generated/session_events.py to remove the three problematic defaulted obj.get(...) call sites.
  • Updated generator-level (Node) and Python regression tests to lock in correct missing-key / explicit-null / round-trip behavior.
Show a summary per file
File Description
scripts/codegen/python.ts Removes schema-default fallback emission in from_dict() generation (both class and flat discriminated-union paths).
python/copilot/generated/session_events.py Regenerated output removing defaulted obj.get(..., default) for the affected optional fields.
nodejs/test/python-codegen.test.ts Adjusts assertions to expect obj.get(key) and adds negative assertions preventing regression.
python/test_event_forward_compatibility.py Replaces the prior test that encoded the buggy behavior with regression tests for missing/null parsing and from_dict(to_dict(x)) round-trip.

Copilot's findings

  • Files reviewed: 3/4 changed files
  • Comments generated: 0

@github-actions github-actions Bot mentioned this pull request May 16, 2026
@stephentoub
Copy link
Copy Markdown
Collaborator Author

@brettcannon would you mind looking at this? Is it the right fix?

The CI prettier check failed on test/python-codegen.test.ts after the assertion update. Apply prettier --write to bring the file back into compliance.

Co-authored-by: Copilot <[email protected]>
@github-actions
Copy link
Copy Markdown
Contributor

Cross-SDK Consistency Review ✅

This PR fixes a Python-specific codegen bug and maintains cross-SDK consistency.

Analysis:

The changes touch only:

  • scripts/codegen/python.ts — removes the defaultLiteral mechanism that baked JSON Schema default values into Python deserialization
  • python/copilot/generated/session_events.py — regenerated output (3 call sites changed)
  • nodejs/test/python-codegen.test.ts — updated codegen test assertions
  • python/test_event_forward_compatibility.py — regression tests

Cross-language impact: None needed. The PR description correctly notes that Go, .NET, Rust, and TypeScript generators were never affected — they use real nullable types end-to-end and don't synthesize default fallbacks during deserialization. I verified that no other codegen script in scripts/codegen/ reads propSchema.default to construct deserialization fallbacks, and the post-fix session_events.py contains no remaining obj.get("key", default) patterns.

Consistency result: The fix actually brings Python into better alignment with the other SDKs — missing optional fields now uniformly return None rather than a schema-default value, matching the behavior users would expect from Go's nil, .NET's null, and TypeScript's undefined/null for unset optional fields.

No changes needed in other SDK implementations.

Generated by SDK Consistency Review Agent for issue #1313 · ● 146.1K ·

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants