KG-549 Add handler for JsonElement in JsonSchemaGenerator#1140
KG-549 Add handler for JsonElement in JsonSchemaGenerator#1140EugeneTheDev merged 5 commits intoJetBrains:developfrom
Conversation
|
@mltheuser Did you verify that it actually works when e.g. |
|
Hi @EugeneTheDev, valid concerns! Here is how I see it: Basically, JsonElement is the wildcard in Kotlin serialization. If a user includes it, they are explicitly asking for a dynamic structure. Mapping that to {} is the standard way to say "anything goes" in JSON Schema (see official docs). I added a test (testFixInvalidJsonElementContent) to verify that our fixing parser handles this correctly in Manual mode, even if the LLM produces messy JSON inside that wildcard field. As for Native mode: right now, the library crashes locally during generation. With this fix, we generate valid schema. If a strict provider doesn't support "Any" types, the API request will simply return a non-successful response. But since it is valid JSON Schema, they really should accept it. I think it's much better to let the provider accept or reject the schema than to crash on our side! |
5ab5657 to
125b4b2
Compare
|
Hi @EugeneTheDev, I wanted to bring this PR back to your attention. It's been a week and I would like to get this merged. |
There was a problem hiding this comment.
Sorry for the delay and thank you again for the proposal. I understand the idea now, but there are a few things I'd like to ask you to change, to make this new processJsonElement method aligned with the overall architecture.
First, let's make processJsonElement a proper memeber of the base abstract JsonSchemaGenerator class, just like the rest of the methods (processString, processNumber etc.), so that concrete generator implementations can provide their own custom logic.
Then, add a stub processJsonElement override in the GenericJsonSchemaGenerator that throws UnsupportedOperationException, just like it is done already for processClassDiscriminator and processPolymorphic.
The "routing logic", i.e., when this method should be invoked, should also be extracted from the processPolymorphic in the StandardJsonSchemaGenerator and moved to the process in the GenericJsonSchemaGenerator. It should likely look something like this
StructureKind.CLASS, StructureKind.OBJECT ->
if (context.descriptor.serialName in JSON_ELEMENT_SERIAL_NAMES) {
processJsonElement(context)
} else {
processObject(context)
}
Please note that JSON_ELEMENT_SERIAL_NAMES would also need to be moved to the GenericJsonSchemaGenerator.kt
And the actual implementation in the StandardJsonSchemaGenerator would stay almost the same, only private should be changed to override
I hope this explains the idea clear enough.
|
Thanks for the feedback, very clear! I have implemented the requested changes. |
8ff99f6 to
e6b594b
Compare
|
Strange, I can't recreate the failure. I will try to mirror the pipeline environment tomorrow, maybe that helps. |
e6b594b to
32b9ab8
Compare
401982b to
6996c46
Compare
|
Finally fixed it^^. |
EugeneTheDev
left a comment
There was a problem hiding this comment.
Thank you, looks good to me now
Related to: [KG-549](https://youtrack.jetbrains.com/issue/KG-549) ## Motivation and Context The `StandardJsonSchemaGenerator` fails with an `IllegalArgumentException` when attempting to generate a schema for any serializable class containing a `JsonElement` property. This occurs because the `processPolymorphic` method relies on an internal structural assumption about `kotlinx.serialization`. For a typical user-defined `sealed class`, the library generates a wrapper object in JSON containing a discriminator (e.g., `"type": "SubclassName"`) and a content field (e.g., `"value": {...}`). The `SerialDescriptor` for such a class mirrors this structure, presenting its elements as a `(type, value)` pair. The generator's current implementation hardcodes a check for this second `"value"` element to find the subtype definitions. However, `JsonElement` is a special-cased, primitive polymorphic type. It is not serialized within a wrapper; it serializes directly to its raw JSON representation. Consequently, its `SerialDescriptor` does not have the `(type, value)` structure. Instead, its elements are a direct list of its possible subtypes (`JsonObject`, `JsonArray`, etc.). This structural mismatch causes the generator's hardcoded check to fail, resulting in the `IllegalArgumentException`. This change introduces a special case to detect `JsonElement` and generates a permissive, empty schema (`{}`), which is the correct JSON Schema representation for "any valid JSON value". The fix is implemented in a type-safe manner and correctly handles both nullable (`JsonElement?`) and non-nullable properties. ## Breaking Changes None. This is a non-breaking bug fix that enables previously unsupported functionality. --- #### Type of the changes - [ ] New feature (non-breaking change which adds functionality) - [x] Bug fix (non-breaking change which fixes an issue) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Tests improvement - [ ] Refactoring #### Checklist - [x] The pull request has a description of the proposed change - [x] I read the [Contributing Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md) before opening the pull request - [x] The pull request uses **`develop`** as the base branch - [x] Tests for the changes have been added - [x] All new and existing tests passed ##### Additional steps for pull requests adding a new feature - [x] An issue describing the proposed change exists - [x] The pull request includes a link to the issue - [x] The change was discussed and approved in the issue - [ ] Docs have been added / updated --------- Co-authored-by: Malte Heuser <malte.heuser@ing.com>
Related to: [KG-549](https://youtrack.jetbrains.com/issue/KG-549) ## Motivation and Context The `StandardJsonSchemaGenerator` fails with an `IllegalArgumentException` when attempting to generate a schema for any serializable class containing a `JsonElement` property. This occurs because the `processPolymorphic` method relies on an internal structural assumption about `kotlinx.serialization`. For a typical user-defined `sealed class`, the library generates a wrapper object in JSON containing a discriminator (e.g., `"type": "SubclassName"`) and a content field (e.g., `"value": {...}`). The `SerialDescriptor` for such a class mirrors this structure, presenting its elements as a `(type, value)` pair. The generator's current implementation hardcodes a check for this second `"value"` element to find the subtype definitions. However, `JsonElement` is a special-cased, primitive polymorphic type. It is not serialized within a wrapper; it serializes directly to its raw JSON representation. Consequently, its `SerialDescriptor` does not have the `(type, value)` structure. Instead, its elements are a direct list of its possible subtypes (`JsonObject`, `JsonArray`, etc.). This structural mismatch causes the generator's hardcoded check to fail, resulting in the `IllegalArgumentException`. This change introduces a special case to detect `JsonElement` and generates a permissive, empty schema (`{}`), which is the correct JSON Schema representation for "any valid JSON value". The fix is implemented in a type-safe manner and correctly handles both nullable (`JsonElement?`) and non-nullable properties. ## Breaking Changes None. This is a non-breaking bug fix that enables previously unsupported functionality. --- #### Type of the changes - [ ] New feature (non-breaking change which adds functionality) - [x] Bug fix (non-breaking change which fixes an issue) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Tests improvement - [ ] Refactoring #### Checklist - [x] The pull request has a description of the proposed change - [x] I read the [Contributing Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md) before opening the pull request - [x] The pull request uses **`develop`** as the base branch - [x] Tests for the changes have been added - [x] All new and existing tests passed ##### Additional steps for pull requests adding a new feature - [x] An issue describing the proposed change exists - [x] The pull request includes a link to the issue - [x] The change was discussed and approved in the issue - [ ] Docs have been added / updated --------- Co-authored-by: Malte Heuser <malte.heuser@ing.com>
Related to: [KG-549](https://youtrack.jetbrains.com/issue/KG-549) ## Motivation and Context The `StandardJsonSchemaGenerator` fails with an `IllegalArgumentException` when attempting to generate a schema for any serializable class containing a `JsonElement` property. This occurs because the `processPolymorphic` method relies on an internal structural assumption about `kotlinx.serialization`. For a typical user-defined `sealed class`, the library generates a wrapper object in JSON containing a discriminator (e.g., `"type": "SubclassName"`) and a content field (e.g., `"value": {...}`). The `SerialDescriptor` for such a class mirrors this structure, presenting its elements as a `(type, value)` pair. The generator's current implementation hardcodes a check for this second `"value"` element to find the subtype definitions. However, `JsonElement` is a special-cased, primitive polymorphic type. It is not serialized within a wrapper; it serializes directly to its raw JSON representation. Consequently, its `SerialDescriptor` does not have the `(type, value)` structure. Instead, its elements are a direct list of its possible subtypes (`JsonObject`, `JsonArray`, etc.). This structural mismatch causes the generator's hardcoded check to fail, resulting in the `IllegalArgumentException`. This change introduces a special case to detect `JsonElement` and generates a permissive, empty schema (`{}`), which is the correct JSON Schema representation for "any valid JSON value". The fix is implemented in a type-safe manner and correctly handles both nullable (`JsonElement?`) and non-nullable properties. ## Breaking Changes None. This is a non-breaking bug fix that enables previously unsupported functionality. --- #### Type of the changes - [ ] New feature (non-breaking change which adds functionality) - [x] Bug fix (non-breaking change which fixes an issue) - [ ] Breaking change (fix or feature that would cause existing functionality to change) - [ ] Documentation update - [ ] Tests improvement - [ ] Refactoring #### Checklist - [x] The pull request has a description of the proposed change - [x] I read the [Contributing Guidelines](https://github.com/JetBrains/koog/blob/main/CONTRIBUTING.md) before opening the pull request - [x] The pull request uses **`develop`** as the base branch - [x] Tests for the changes have been added - [x] All new and existing tests passed ##### Additional steps for pull requests adding a new feature - [x] An issue describing the proposed change exists - [x] The pull request includes a link to the issue - [x] The change was discussed and approved in the issue - [ ] Docs have been added / updated --------- Co-authored-by: Malte Heuser <malte.heuser@ing.com>
Related to: KG-549
Motivation and Context
The
StandardJsonSchemaGeneratorfails with anIllegalArgumentExceptionwhen attempting to generate a schema for any serializable class containing aJsonElementproperty.This occurs because the
processPolymorphicmethod relies on an internal structural assumption aboutkotlinx.serialization. For a typical user-definedsealed class, the library generates a wrapper object in JSON containing a discriminator (e.g.,"type": "SubclassName") and a content field (e.g.,"value": {...}).The
SerialDescriptorfor such a class mirrors this structure, presenting its elements as a(type, value)pair. The generator's current implementation hardcodes a check for this second"value"element to find the subtype definitions.However,
JsonElementis a special-cased, primitive polymorphic type. It is not serialized within a wrapper; it serializes directly to its raw JSON representation. Consequently, itsSerialDescriptordoes not have the(type, value)structure. Instead, its elements are a direct list of its possible subtypes (JsonObject,JsonArray, etc.). This structural mismatch causes the generator's hardcoded check to fail, resulting in theIllegalArgumentException.This change introduces a special case to detect
JsonElementand generates a permissive, empty schema ({}), which is the correct JSON Schema representation for "any valid JSON value". The fix is implemented in a type-safe manner and correctly handles both nullable (JsonElement?) and non-nullable properties.Breaking Changes
None. This is a non-breaking bug fix that enables previously unsupported functionality.
Type of the changes
Checklist
developas the base branchAdditional steps for pull requests adding a new feature