Fix GenAI numbers datatype#2646

Open
smortex wants to merge 3 commits into elastic:main from smortex:fix-gen_ai-integers

Conversation

@smortex
Contributor

@smortex smortex commented May 9, 2026

1. What does this PR do?

According to the ECS conventions, the datatype for integers (numbers) should be long:
https://www.elastic.co/docs/reference/ecs/ecs-conventions#_datatype_for_integers

When the GenAI fields were added in #2475, a number of integer fields were introduced, but their datatype was not discussed. This created an inconsistency and, without a strong reason for preferring integer to long, breaks the ECS conventions.

This PR fixes this by switching these fields to the long datatype, as the ECS conventions expect.

2. Which ECS fields are affected/introduced?

  • genai.request.choice.count
  • genai.request.max_tokens
  • genai.request.seed
  • genai.usage.input_tokens
  • genai.usage.output_tokens
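
As an illustration of what the change amounts to for these fields, here is a minimal Python sketch. It assumes a simplified, hypothetical representation of field definitions as dicts with `name` and `type` keys; the real layout of schemas/gen_ai.yml is not shown on this page. The helper `widen_integer_fields` and the sample entries are illustrative names, not part of the ECS tooling.

```python
# Field names exactly as listed in this PR description.
AFFECTED_FIELDS = {
    "genai.request.choice.count",
    "genai.request.max_tokens",
    "genai.request.seed",
    "genai.usage.input_tokens",
    "genai.usage.output_tokens",
}


def widen_integer_fields(fields, affected=AFFECTED_FIELDS):
    """Return a copy of `fields` where affected `integer` fields use `long`."""
    updated = []
    for field in fields:
        field = dict(field)  # copy so the input list is left untouched
        if field["name"] in affected and field["type"] == "integer":
            field["type"] = "long"
        updated.append(field)
    return updated


# Hypothetical sample input: one affected field and one unrelated field.
sample = [
    {"name": "genai.request.seed", "type": "integer"},
    {"name": "example.unrelated", "type": "keyword"},
]
result = widen_integer_fields(sample)
```

Only fields that are both in the affected set and currently typed `integer` are rewritten, so running the helper a second time changes nothing.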

3. Why is this change necessary?

Compliance with the ECS conventions.

4. Have you added/updated documentation?

N/A

5. Have you built ECS and committed any newly generated files?

YES

6. Have you run the ECS validation tests locally?

YES

7. Anything else for the reviewers?

Thank you!


Commit Message

Fix GenAI numbers datatype

According to the ECS conventions, the datatype for integers (numbers)
should be long:
https://www.elastic.co/docs/reference/ecs/ecs-conventions#_datatype_for_integers

When GenAI fields were added in #2475, a bunch of integer fields
were added, but their datatype was not discussed. This introduced some
inconsistency, and without a strong reason for preferring integer to
long, breaks the ECS conventions.

Update these fields to use the long datatype as expected by the ECS
conventions.

@smortex smortex requested a review from a team May 9, 2026 05:05
@github-actions
Contributor

github-actions Bot commented May 9, 2026

Documentation changes preview: https://docs-v3-preview.elastic.dev/elastic/ecs/pull/2646/reference/

@github-actions
Contributor

github-actions Bot commented May 9, 2026

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@github-actions
Contributor

github-actions Bot commented May 9, 2026

ECS PR Triage (automated)

PR Triage Report

PR: #2646 — Fix GenAI numbers datatype
Author: @smortex (Romain Tartiere)
Classification: Needs Discussion
Change type: Schema Change
Scope: Moderate

Summary

This PR changes the type of 5 GenAI integer fields from integer to long in schemas/gen_ai.yml to comply with the ECS conventions for integer datatypes, which state that the default integer type should be long. The fields were originally added as integer in #2475 without explicit discussion of the datatype. All generated artifacts and docs have been regenerated accordingly. While integer → long is a widening (non-breaking) type change and the affected fields are all beta maturity, changing type: on existing fields is flagged as a potential breaking change per classification rules — maintainer judgment is needed to confirm this is safe to merge as a direct bugfix.

Files changed

  • Schemas: schemas/gen_ai.yml (5 fields: type: integer → type: long)
  • Generated: generated/beats/fields.ecs.yml, generated/csv/fields.csv, generated/ecs/ecs_flat.yml, generated/ecs/ecs_nested.yml, generated/elasticsearch/composable/component/gen_ai.json, generated/elasticsearch/legacy/template.json, and corresponding experimental/generated/ counterparts (12 generated files total)
  • Tooling/scripts/tests: none
  • Docs (hand-authored): none
  • Docs (generated): docs/reference/ecs-gen_ai.md (regenerated — consistent with schema change)
  • CI / GitHub: none
  • RFCs: none

Routing decision

The change modifies type: on 5 existing fields, which per classification-rules §1 ("changing type") is listed as a breaking change trigger that would normally require an RFC. However, several mitigating factors make this borderline rather than a clear RFC case:

  1. Widening, not narrowing: integer → long is a compatible type widening in Elasticsearch. Existing indexed data remains valid; existing queries and aggregations continue to work. No data loss or semantic incompatibility results.
  2. Beta maturity: All 5 affected fields carry beta: This field is beta and subject to change. — beta fields are explicitly allowed to change before GA.
  3. Convention compliance bugfix: The ECS conventions document explicitly states integers should use long. The original integer type appears to have been an oversight in [RFC 0050] Stage 2: Introducing GenAI fields #2475 rather than a deliberate design choice.

Given these factors, this falls between §1 (RFC required — type change) and §2 (Direct PR — bugfix/strict-mode fix). The conservative classification is Needs Discussion: a maintainer should confirm that this non-breaking type widening on beta fields is acceptable as a direct PR without an RFC.
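
The widening argument can be checked with a short sketch. It assumes the documented Elasticsearch ranges, where integer is a signed 32-bit value and long is a signed 64-bit value; the helper names and the sample count are hypothetical.

```python
# Inclusive value ranges of the two Elasticsearch numeric types.
INTEGER_RANGE = (-2**31, 2**31 - 1)  # signed 32-bit
LONG_RANGE = (-2**63, 2**63 - 1)     # signed 64-bit


def fits(value, bounds):
    """Return True if `value` lies within the inclusive `bounds`."""
    low, high = bounds
    return low <= value <= high


def is_widening(old, new):
    """Return True if every value allowed by `old` is also allowed by `new`."""
    return new[0] <= old[0] and old[1] <= new[1]


# A hypothetical count one past the largest 32-bit signed value.
large_count = 2**31

widening = is_widening(INTEGER_RANGE, LONG_RANGE)
narrowing = is_widening(LONG_RANGE, INTEGER_RANGE)
```

Here `widening` is true and `narrowing` is false: any value valid as integer is also valid as long, while a value such as `large_count` fits only in long.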

Risk notes

  • Breaking / deprecation: Technically a type: change, but non-breaking in practice: long is a strict superset of integer in Elasticsearch. No existing data, mappings, or queries are invalidated. No deprecation involved.
  • OTel / semconv: N/A — no OTel mapping changes. The otel: metadata on these fields is unchanged and remains correct (the OTel semconv does not prescribe Elasticsearch mapping types).
  • Scope / reuse: No new fields, no new field set, no reuse topology changes. Affects only the gen_ai field set's existing fields. All fields remain extended level and beta maturity.

Completeness checklist

  • PR description (all sections) — All 7 template sections are filled with substantive answers. Clear explanation of what changed, why, and confirmation that ECS was built and tests were run.
  • CHANGELOG.next.md — Missing. Since schemas/gen_ai.yml is modified, a CHANGELOG.next.md entry is required. It should be added under Schema Changes > Bugfixes (or Changed, depending on maintainer preference) with #2646.
  • make + committed generated outputs — All expected generated artifacts are present in the diff (generated/, experimental/generated/, docs/reference/ecs-gen_ai.md). The regenerated outputs are consistent with the schema change (only integer → long replacements).
  • OTel otel: on new/changed semconv-related fields — N/A (no new fields; existing otel: metadata unchanged and correct).
  • Tests / make check — PR author states tests were run locally. CI will verify.
  • CLA (contributor) — Not verifiable from diff; GitHub CLA bot will check.

Recommended next actions

  1. Maintainer: Confirm that integer → long on beta fields is acceptable as a direct PR (bugfix) without an RFC, given the non-breaking widening nature and ECS convention compliance. If agreed, reclassify as Direct PR.
  2. Contributor: Add a CHANGELOG.next.md entry. Suggested placement under Schema Changes > Bugfixes:
    - Fix GenAI integer fields to use `long` datatype per ECS conventions. #2646
    
  3. Contributor: Ensure the Elastic CLA is signed (if not already).
  4. CI: Verify make check passes in CI to confirm generated artifacts are consistent.

Posted by PR Triage workflow

smortex added 2 commits May 8, 2026 19:07
According to the ECS conventions, the datatype for integers (numbers)
should be `long`:
https://www.elastic.co/docs/reference/ecs/ecs-conventions#_datatype_for_integers

When GenAI fields were added in elastic#2475, a bunch of `integer` fields
were added, but their datatype was not discussed.  This introduced some
inconsistency, and without a strong reason for preferring `integer` to
`long`, breaks the ECS conventions.

Update these fields to use the `long` datatype as expected by the ECS
conventions.
Regenerate artifacts after the last commit to keep them in sync in the
repo.
@smortex smortex force-pushed the fix-gen_ai-integers branch from fd9c735 to 45a9736 Compare May 9, 2026 05:13
@smortex
Contributor Author

smortex commented May 9, 2026

Sorry, I messed up my upstreams before working on this PR and pushed garbage. I have fixed my commits and force-pushed; this is ready for review.
