Skip to content

KG-549 Add handler for JsonElement in JsonSchemaGenerator#1140

Merged
EugeneTheDev merged 5 commits intoJetBrains:developfrom
mltheuser:mltheuser/kg-549
Dec 3, 2025
Merged

KG-549 Add handler for JsonElement in JsonSchemaGenerator#1140
EugeneTheDev merged 5 commits intoJetBrains:developfrom
mltheuser:mltheuser/kg-549

Conversation

@mltheuser
Copy link
Copy Markdown
Contributor

Related to: KG-549

Motivation and Context

The StandardJsonSchemaGenerator fails with an IllegalArgumentException when attempting to generate a schema for any serializable class containing a JsonElement property.

This occurs because the processPolymorphic method relies on an internal structural assumption about kotlinx.serialization. For a typical user-defined sealed class, the library generates a wrapper object in JSON containing a discriminator (e.g., "type": "SubclassName") and a content field (e.g., "value": {...}).

The SerialDescriptor for such a class mirrors this structure, presenting its elements as a (type, value) pair. The generator's current implementation hardcodes a check for this second "value" element to find the subtype definitions.

However, JsonElement is a special-cased, primitive polymorphic type. It is not serialized within a wrapper; it serializes directly to its raw JSON representation. Consequently, its SerialDescriptor does not have the (type, value) structure. Instead, its elements are a direct list of its possible subtypes (JsonObject, JsonArray, etc.). This structural mismatch causes the generator's hardcoded check to fail, resulting in the IllegalArgumentException.

This change introduces a special case to detect JsonElement and generates a permissive, empty schema ({}), which is the correct JSON Schema representation for "any valid JSON value". The fix is implemented in a type-safe manner and correctly handles both nullable (JsonElement?) and non-nullable properties.

Breaking Changes

None. This is a non-breaking bug fix that enables previously unsupported functionality.


Type of the changes

  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Tests improvement
  • Refactoring

Checklist

  • The pull request has a description of the proposed change
  • I read the Contributing Guidelines before opening the pull request
  • The pull request uses develop as the base branch
  • Tests for the changes have been added
  • All new and existing tests passed
Additional steps for pull requests adding a new feature
  • An issue describing the proposed change exists
  • The pull request includes a link to the issue
  • The change was discussed and approved in the issue
  • Docs have been added / updated

@EugeneTheDev
Copy link
Copy Markdown
Collaborator

EugeneTheDev commented Nov 18, 2025

@mltheuser Did you verify that it actually works when e.g. StructuredRequest.Native is used? Are empty schemas allowed there? And even when you use StructuredRequest.Manual, i.e., a manual prompting, would LLMs be able to figure out the schema required? Isn't it kind of the point of a structured output to provide some rigid structure for the LLM to adhere to?

@mltheuser
Copy link
Copy Markdown
Contributor Author

Hi @EugeneTheDev, valid concerns! Here is how I see it:

Basically, JsonElement is the wildcard in Kotlin serialization. If a user includes it, they are explicitly asking for a dynamic structure. Mapping that to {} is the standard way to say "anything goes" in JSON Schema (see official docs).

I added a test (testFixInvalidJsonElementContent) to verify that our fixing parser handles this correctly in Manual mode, even if the LLM produces messy JSON inside that wildcard field.

As for Native mode: right now, the library crashes locally during generation. With this fix, we generate valid schema. If a strict provider doesn't support "Any" types, the API request will simply return a non-successful response. But since it is valid JSON Schema, they really should accept it. I think it's much better to let the provider accept or reject the schema than to crash on our side!

@mltheuser
Copy link
Copy Markdown
Contributor Author

Hi @EugeneTheDev, I wanted to bring this PR back to your attention. It's been a week and I would like to get this merged.

Copy link
Copy Markdown
Collaborator

@EugeneTheDev EugeneTheDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay and thank you again for the proposal. I understand the idea now, but there are a few things I'd like to ask you to change, to make this new processJsonElement method aligned with the overall architecture.

First, let's make processJsonElement a proper memeber of the base abstract JsonSchemaGenerator class, just like the rest of the methods (processString, processNumber etc.), so that concrete generator implementations can provide their own custom logic.

Then, add a stub processJsonElement override in the GenericJsonSchemaGenerator that throws UnsupportedOperationException, just like it is done already for processClassDiscriminator and processPolymorphic.

The "routing logic", i.e., when this method should be invoked, should also be extracted from the processPolymorphic in the StandardJsonSchemaGenerator and moved to the process in the GenericJsonSchemaGenerator. It should likely look something like this

StructureKind.CLASS, StructureKind.OBJECT ->
    if (context.descriptor.serialName in JSON_ELEMENT_SERIAL_NAMES) {
        processJsonElement(context)
    } else {
        processObject(context)
    }

Please note that JSON_ELEMENT_SERIAL_NAMES would also need to be moved to the GenericJsonSchemaGenerator.kt

And the actual implementation in the StandardJsonSchemaGenerator would stay almost the same, only private should be changed to override

I hope this explains the idea clear enough.

@mltheuser
Copy link
Copy Markdown
Contributor Author

mltheuser commented Dec 1, 2025

Thanks for the feedback, very clear! I have implemented the requested changes.

@mltheuser mltheuser force-pushed the mltheuser/kg-549 branch 2 times, most recently from 8ff99f6 to e6b594b Compare December 1, 2025 21:21
@mltheuser
Copy link
Copy Markdown
Contributor Author

mltheuser commented Dec 1, 2025

Strange, I can't recreate the failure. I will try to mirror the pipeline environment tomorrow, maybe that helps.

@mltheuser
Copy link
Copy Markdown
Contributor Author

Finally fixed it^^.

Copy link
Copy Markdown
Collaborator

@EugeneTheDev EugeneTheDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, looks good to me now

@EugeneTheDev EugeneTheDev changed the title KG-549 Add handler for JsonElement in StandardJsonSchemaGenerator.kt KG-549 Add handler for JsonElement in JsonSchemaGenerator Dec 3, 2025
@EugeneTheDev EugeneTheDev merged commit 0054d4a into JetBrains:develop Dec 3, 2025
10 checks passed
kpavlov pushed a commit that referenced this pull request Dec 5, 2025
Related to: [KG-549](https://youtrack.jetbrains.com/issue/KG-549)

## Motivation and Context
The `StandardJsonSchemaGenerator` fails with an
`IllegalArgumentException` when attempting to generate a schema for any
serializable class containing a `JsonElement` property.

This occurs because the `processPolymorphic` method relies on an
internal structural assumption about `kotlinx.serialization`. For a
typical user-defined `sealed class`, the library generates a wrapper
object in JSON containing a discriminator (e.g., `"type":
"SubclassName"`) and a content field (e.g., `"value": {...}`).

The `SerialDescriptor` for such a class mirrors this structure,
presenting its elements as a `(type, value)` pair. The generator's
current implementation hardcodes a check for this second `"value"`
element to find the subtype definitions.

However, `JsonElement` is a special-cased, primitive polymorphic type.
It is not serialized within a wrapper; it serializes directly to its raw
JSON representation. Consequently, its `SerialDescriptor` does not have
the `(type, value)` structure. Instead, its elements are a direct list
of its possible subtypes (`JsonObject`, `JsonArray`, etc.). This
structural mismatch causes the generator's hardcoded check to fail,
resulting in the `IllegalArgumentException`.

This change introduces a special case to detect `JsonElement` and
generates a permissive, empty schema (`{}`), which is the correct JSON
Schema representation for "any valid JSON value". The fix is implemented
in a type-safe manner and correctly handles both nullable
(`JsonElement?`) and non-nullable properties.

## Breaking Changes
None. This is a non-breaking bug fix that enables previously unsupported
functionality.

---

#### Type of the changes
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Tests improvement
- [ ] Refactoring

#### Checklist
- [x] The pull request has a description of the proposed change
- [x] I read the [Contributing
Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md)
before opening the pull request
- [x] The pull request uses **`develop`** as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed

##### Additional steps for pull requests adding a new feature
- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [x] The change was discussed and approved in the issue
- [ ] Docs have been added / updated

---------

Co-authored-by: Malte Heuser <malte.heuser@ing.com>
sdubov pushed a commit that referenced this pull request Dec 5, 2025
Related to: [KG-549](https://youtrack.jetbrains.com/issue/KG-549)

## Motivation and Context
The `StandardJsonSchemaGenerator` fails with an
`IllegalArgumentException` when attempting to generate a schema for any
serializable class containing a `JsonElement` property.

This occurs because the `processPolymorphic` method relies on an
internal structural assumption about `kotlinx.serialization`. For a
typical user-defined `sealed class`, the library generates a wrapper
object in JSON containing a discriminator (e.g., `"type":
"SubclassName"`) and a content field (e.g., `"value": {...}`).

The `SerialDescriptor` for such a class mirrors this structure,
presenting its elements as a `(type, value)` pair. The generator's
current implementation hardcodes a check for this second `"value"`
element to find the subtype definitions.

However, `JsonElement` is a special-cased, primitive polymorphic type.
It is not serialized within a wrapper; it serializes directly to its raw
JSON representation. Consequently, its `SerialDescriptor` does not have
the `(type, value)` structure. Instead, its elements are a direct list
of its possible subtypes (`JsonObject`, `JsonArray`, etc.). This
structural mismatch causes the generator's hardcoded check to fail,
resulting in the `IllegalArgumentException`.

This change introduces a special case to detect `JsonElement` and
generates a permissive, empty schema (`{}`), which is the correct JSON
Schema representation for "any valid JSON value". The fix is implemented
in a type-safe manner and correctly handles both nullable
(`JsonElement?`) and non-nullable properties.

## Breaking Changes
None. This is a non-breaking bug fix that enables previously unsupported
functionality.

---

#### Type of the changes
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Tests improvement
- [ ] Refactoring

#### Checklist
- [x] The pull request has a description of the proposed change
- [x] I read the [Contributing
Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md)
before opening the pull request
- [x] The pull request uses **`develop`** as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed

##### Additional steps for pull requests adding a new feature
- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [x] The change was discussed and approved in the issue
- [ ] Docs have been added / updated

---------

Co-authored-by: Malte Heuser <malte.heuser@ing.com>
vova-jb pushed a commit that referenced this pull request Jan 27, 2026
Related to: [KG-549](https://youtrack.jetbrains.com/issue/KG-549)

## Motivation and Context
The `StandardJsonSchemaGenerator` fails with an
`IllegalArgumentException` when attempting to generate a schema for any
serializable class containing a `JsonElement` property.

This occurs because the `processPolymorphic` method relies on an
internal structural assumption about `kotlinx.serialization`. For a
typical user-defined `sealed class`, the library generates a wrapper
object in JSON containing a discriminator (e.g., `"type":
"SubclassName"`) and a content field (e.g., `"value": {...}`).

The `SerialDescriptor` for such a class mirrors this structure,
presenting its elements as a `(type, value)` pair. The generator's
current implementation hardcodes a check for this second `"value"`
element to find the subtype definitions.

However, `JsonElement` is a special-cased, primitive polymorphic type.
It is not serialized within a wrapper; it serializes directly to its raw
JSON representation. Consequently, its `SerialDescriptor` does not have
the `(type, value)` structure. Instead, its elements are a direct list
of its possible subtypes (`JsonObject`, `JsonArray`, etc.). This
structural mismatch causes the generator's hardcoded check to fail,
resulting in the `IllegalArgumentException`.

This change introduces a special case to detect `JsonElement` and
generates a permissive, empty schema (`{}`), which is the correct JSON
Schema representation for "any valid JSON value". The fix is implemented
in a type-safe manner and correctly handles both nullable
(`JsonElement?`) and non-nullable properties.

## Breaking Changes
None. This is a non-breaking bug fix that enables previously unsupported
functionality.

---

#### Type of the changes
- [ ] New feature (non-breaking change which adds functionality)
- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] Breaking change (fix or feature that would cause existing
functionality to change)
- [ ] Documentation update
- [ ] Tests improvement
- [ ] Refactoring

#### Checklist
- [x] The pull request has a description of the proposed change
- [x] I read the [Contributing
Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md)
before opening the pull request
- [x] The pull request uses **`develop`** as the base branch
- [x] Tests for the changes have been added
- [x] All new and existing tests passed

##### Additional steps for pull requests adding a new feature
- [x] An issue describing the proposed change exists
- [x] The pull request includes a link to the issue
- [x] The change was discussed and approved in the issue
- [ ] Docs have been added / updated

---------

Co-authored-by: Malte Heuser <malte.heuser@ing.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants