Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .agents/rules/ecs-schema-standards.mdc
Original file line number Diff line number Diff line change
Expand Up @@ -33,3 +33,8 @@ List of mappings; each needs `relation`: `match`, `equivalent`, `related`, `conf
- If `expected_values` is set, `example` must be consistent with allowed values.

After edits: run `make` and commit `generated/` and generated `docs/reference/` as required by the repo.

## Automated PR quality review

For Cursor / agent workflows aimed at reviewer-style feedback (naming, descriptions, examples, duplicates vs base inventory), follow [ecs-pr-review skill](../skills/ecs-pr-review/SKILL.md) and [quality-rules.md](../skills/ecs-pr-review/quality-rules.md). GitHub Actions runs [.github/workflows/pr-review.yml](../../.github/workflows/pr-review.yml).

75 changes: 75 additions & 0 deletions .agents/skills/ecs-pr-review/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
---
name: ecs-pr-review
description: >-
Reviews ECS pull requests for schema quality before human review: naming,
description clarity, type strictness, example/pattern/expected_values
consistency, OTel hints, and duplication/conflict checks against a base schema
inventory. Produces severity-graded (High/Medium/Low) actionable findings.
---

# ECS PR quality review

This skill reviews **schema and related changes** in an ECS pull request. It complements **ecs-pr-triage** (routing): triage classifies *where* a change should go; this skill checks *how well* fields are defined.

## Inputs

Use what is available in context, or fetch via `gh`:

- `gh pr view <N> --repo <owner/repo> --json title,body,files,additions,deletions,baseRefName,headRefName`
- `gh pr diff <N> --repo <owner/repo>`
- **Schema inventory** at `schema-inventory.tsv` (or equivalent) when provided — flattened field paths from the **base** branch for collision/duplicate reasoning.

## Execution steps

### 1. Scope the review

- Identify files in the diff; prioritize `schemas/**/*.yml`.
- Note any changes under `generated/`, `docs/reference/ecs-*.md`, `scripts/`, `rfcs/`, or `CHANGELOG.next.md` for process rules (see [quality-rules.md](quality-rules.md) **GEN-** rules).

### 2. Parse schema-relevant hunks

- For each `schemas/*.yml` hunk, extract **new or modified field entries** (including nested `fields:`) and the **field set** context (`name`, `root`, `reusable` if present).
- Build the **flattened field path** for each leaf the same way as ECS docs (e.g. `agent.version`, `process.parent.pid`, `@timestamp` for root `base`).

### 3. Apply quality rules

Walk [quality-rules.md](quality-rules.md) **by category** (types, stability, naming, descriptions, examples, structure, duplicates, generated files, OTel):

- Map each issue to a **Rule ID** (e.g. `TYP-H01`, `NM-M02`).
- Assign **High**, **Medium**, or **Low** per the table and §10 of [quality-rules.md](quality-rules.md).
- **Evidence** must cite the diff line or inventory row (field path + file).

### 4. Duplication and conflicts (inventory)

With **schema-inventory.tsv**:

- **Columns (expected):** `flat_name`, `type`, `level`, `fieldset`, `source_file`, `description_snippet` (tab-separated).
- **DUP-H01 / DUP-M01:** compare new/changed paths and semantics against inventory; flag collisions and near-duplicates.
- If inventory is missing, still run **structural** checks from the diff only, and state that collision checks were **limited**.

### 5. Write the report

Fill [report-template.md](report-template.md):

- Start with `# ECS PR Quality Review (automated)` as the title line (H1).
- Immediately after the title block, include **`Overall:`** with **`High: N | Medium: M | Low: L`** (counts inferred from findings).
- Include **`### Summary details`** as 2–4 bullets capturing top themes next (GitHub “summary details” section content).
- Use `<details><summary>…</summary>` **per severity bucket** as in the template.
- End with contributor pointers (CONTRIBUTING, guidelines, schemas README).

### 6. Behaviour constraints

- **No merge authority:** do not approve, request GitHub Review states, or block CI.
- **Actionable fixes:** each finding ends with concrete YAML or process steps.
- Prefer **fewer, accurate** findings over noisy low-value **Low** items.

## Related assets

- [quality-rules.md](quality-rules.md) — canonical rule IDs and severities.
- [report-template.md](report-template.md) — exact output structure.

## Repo rules cross-reference

- [ecs-schema-standards.mdc](../../rules/ecs-schema-standards.mdc)
- [ecs-pr-completeness.mdc](../../rules/ecs-pr-completeness.mdc)
- **Routing / RFC:** still use [ecs-pr-triage](../ecs-pr-triage/SKILL.md) when the question is *Direct PR vs RFC*.
112 changes: 112 additions & 0 deletions .agents/skills/ecs-pr-review/quality-rules.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,112 @@
# ECS PR Quality Review — quality rules

Use these rules when reviewing **schema YAML** under `schemas/**/*.yml` and related changes in a pull request. Each rule has an **ID**, **severity** (**High** / **Medium** / **Low**), and **remediation** guidance.

**Sources of truth (read before inferring):** [schemas/README.md](../../../schemas/README.md), [ecs-schema-standards.mdc](../../rules/ecs-schema-standards.mdc), [docs/reference/ecs-guidelines.md](../../../docs/reference/ecs-guidelines.md), [docs/reference/ecs-conventions.md](../../../docs/reference/ecs-conventions.md), [CONTRIBUTING.md](../../../CONTRIBUTING.md).

**Allowed field `type` values** (must match Elasticsearch mapping types used in ECS):
`binary`, `boolean`, `keyword`, `constant_keyword`, `wildcard`, `long`, `integer`, `short`, `byte`, `double`, `float`, `half_float`, `scaled_float`, `unsigned_long`, `date`, `date_nanos`, `alias`, `object`, `flattened`, `nested`, `join`, `long_range`, `double_range`, `date_range`, `ip`, `text`, `match_only_text`, `geo_point`, `geo_shape`, `point`, `shape`.

**Allowed `level` values:** `core`, `extended` (`custom` may appear in legacy contexts; prefer `core`/`extended` for new work per repo tests and docs).

---

## 1. Types and required keys

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **TYP-H01** | High | Invalid `type` | `type` is missing or not in the allowed list above. | Set a valid ECS/ES type (see [ecs-conventions](../../../docs/reference/ecs-conventions.md) for integer/text defaults). |
| **TYP-H02** | High | Invalid `level` | `level` missing or not `core` / `extended`. | Set `level` per field role in the schema. |
| **TYP-H03** | High | Missing required field attributes | Leaf field missing `name`, `description`, `type`, or `level`. | Add all required keys per [schemas/README.md](../../../schemas/README.md). |
| **TYP-H04** | High | `alias` without `path` | `type: alias` and no `path`. | Add `path` to the canonical target field. |
| **TYP-H05** | High | `scaled_float` without `scaling_factor` | `type: scaled_float` and no `scaling_factor`. | Add `scaling_factor` (see existing fields for examples). |

---

## 2. Stability markers

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **STB-H01** | High | `alpha` and `beta` together | Same field or field set has both `alpha` and `beta`. | Remove one marker or split the change per [schemas/README.md](../../../schemas/README.md). |
| **STB-M01** | Medium | Multiline `alpha` / `beta` text | Newline inside `alpha` or `beta` string. | Use a single line (strict/cleaner expectation). |

---

## 3. Naming

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **NM-M01** | Medium | Name token shape | Leaf segment uses uppercase, spaces, or characters other than `[a-z0-9_.@]` (allow `@` only where existing convention, e.g. `@timestamp`). | Use lowercase, underscores; follow [ecs-guidelines](../../../docs/reference/ecs-guidelines.md). |
| **NM-M02** | Medium | Stuttering | Field name repeats the field-set prefix (e.g. `host.host_ip` → should be `host.ip`) or obvious repetition. | Rename to avoid stutter; document rare exceptions (e.g. `host.hostname`). |
| **NM-M03** | Medium | Dubious abbreviation | Abbreviation not in common ECS exceptions (`ip`, `os`, `geo`, `id`, `url`, `http`, `https`, `dns`, `tls`, `api`, `pid`, `uuid`, `sql`, `csv`, `as` field set name, etc.). | Prefer full words unless strongly conventional. |
| **NM-L01** | Low | Minor naming polish | Wording could align better with sibling fields (`.id` vs `.name` patterns). | Align with nearby fields in the same field set. |

---

## 4. Descriptions and `short`

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **DSC-M01** | Medium | Multi-paragraph `description` without `short` | `description` has two+ paragraphs (blank line) and no `short`. | Add a single-line `short` per [schemas/README.md](../../../schemas/README.md). |
| **DSC-M02** | Medium | `short` length / shape | `short` longer than 120 characters or contains newlines. | Trim to ≤120 chars, single line (strict mode / [ecs-schema-standards](../../rules/ecs-schema-standards.mdc)). |
| **DSC-M03** | Medium | Vague or self-referential description | Description only restates the field name with no semantics, or contradicts `type`/`example`. | Clarify meaning, units, population rules, and boundaries. |
| **DSC-L01** | Low | Long single-paragraph description | Very long description without `short` (optional clarity for UIs). | Add `short` for scanability. |

---

## 5. Examples, `pattern`, `expected_values`

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **EX-M01** | Medium | Unquoted composite `example` | `example` is YAML array or mapping (not a quoted string) where composite should be stringified for generators. | Quote as a string (see [schemas/README.md](../../../schemas/README.md)). |
| **EX-M02** | Medium | `example` vs `pattern` | `pattern` set and example string does not `re.match` the pattern. | Fix example or pattern so they agree ([`cleaner.check_example_value`](../../../scripts/schema/cleaner.py)). |
| **EX-M03** | Medium | `example` vs `expected_values` | `expected_values` list present and example not in list (after array-normalize for `normalize: array`). | Align example with a listed value or adjust `expected_values`. |
| **EX-L01** | Low | Missing `example` on new leaf | New leaf field has no `example` where one would aid implementers. | Add a realistic `example`. |

---

## 6. Structural: `object`, `flattened`, reuse

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **STR-M01** | Medium | `object` / `flattened` without children | `type` is `object` or `flattened`, no child `fields`, PR text does not justify opaque shape per [schemas/README.md](../../../schemas/README.md). | Model explicit children OR document homogeneous opaque maps (e.g. string keys/values). |
| **STR-H01** | High | Breaking reuse assumptions | Diff changes `reusable.expected`/`top_level` in ways that break established mounting or duplicates trees incorrectly. | Re-check [schemas/README.md](../../../schemas/README.md) reuse section and RFC needs. |

---

## 7. Duplication and conflicts vs inventory

The review agent receives a **`schema-inventory.tsv`** (or similar) built from **`schemas/*.yml` at the PR base**. It lists flattened field names (`agent.name`, `@timestamp`, …) and types/descriptions snippets for comparison.

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **DUP-H01** | High | Canonical name collision | New/changed flattened field path duplicates an inventory path that already refers to another distinct definition (excluding identical line from same merged entry). If PR only moves within same RFC flow, reconcile in schema. | Rename or consolidate; avoid conflicting definitions across field sets/locations. |
| **DUP-M01** | Medium | Semantic overlap | New field’s description/examples overlap an existing inventory field (`host.ip` vs new `host.ipv4`) without differentiation. | Reference existing fields, extend with clearer names, or merge concepts. |
| **DUP-M02** | Medium | Duplicate `allowed_values` / category | New `allowed_values` entry name collides or duplicates semantics of an existing entry. | Merge or rename; consider consumers of categorization docs. |

---

## 8. Generated artifacts and docs

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **GEN-H01** | High | Hand-edited generated reference | Diff touches `generated/**` or generator-only `docs/reference/ecs-*.md` without corresponding `schemas/**` (or documented generator-only reason). | Revert manual edits; run `make` / `make check` per [CONTRIBUTING.md](../../../CONTRIBUTING.md). |
| **GEN-M01** | Medium | Changelog expectation | Schema or `scripts/` change with no `CHANGELOG.next.md` update when one is expected per [ecs-pr-completeness](../../rules/ecs-pr-completeness.mdc). | Add entry with `#NNNN` when required. |

---

## 9. OpenTelemetry mappings

| ID | Severity | Summary | Detection | Remediation |
|----|----------|---------|-----------|-------------|
| **OTL-M01** | Medium | Likely semconv alignment missing | Field clearly parallels a well-known OTel attribute (naming/type) and has no `otel:` block; optional but recommended in [CONTRIBUTING.md](../../../CONTRIBUTING.md). | Add `otel:` with correct `relation` or `relation: na` with justification if intentionally unmapped. |

---

## 10. Severity guidelines for the agent

- **High:** Breaks schema contract, will or should fail generator/tests, illegal attributes, dangerous collisions, or edits generated-only outputs incorrectly.
- **Medium:** Violates ECS conventions, strict cleaner rules, unclear docs, likely maintainer pushback.
- **Low:** Polish, optional examples, stylistic consistency, non-blocking suggestions.

When uncertain between two severities, prefer the **higher** one in the checklist and explain why in one sentence.
81 changes: 81 additions & 0 deletions .agents/skills/ecs-pr-review/report-template.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
# ECS PR Quality Review — report template

Copy and fill for every automated review run. Produce **filled GitHub-flavored Markdown** (not placeholders). Wrap each severity bucket in `<details>` for the PR comment unless the tooling posts sections separately.

````markdown
# ECS PR Quality Review (automated)

**PR:** #[N] — [title]

**Overall:** **High:** [H] | **Medium:** [M] | **Low:** [L]
**Scope:** Focus on changed `schemas/**/*.yml` and schema-related tooling/docs in this PR — other files may be listed only if relevant.

### Summary details

Brief bullet list ([2–4] bullets):
- Highest-severity themes (e.g. “invalid type”, “collision with inventory”, “missing short”).
- Whether `make check` / strict generator would likely fail.
- One line on duplication or overlap with existing fields (or “none found”).

---

<details>
<summary>High Severity ([H] findings)</summary>

_Use `### HX — [Rule-ID]: [short title]` per finding (X numbering within this section)._

### H1 — [RULE-ID]: [Short title]

- **Field / path:** `[flat field name or field set]`
- **File:** `[path]`
- **Rule:** One sentence tying to ecs-pr-review quality-rules.
- **Evidence:** Quote or paraphrase the diff / inventory line.
- **Fix:** Concrete steps (edit keys, rename, revert generated file, etc.).

_(Repeat for each High finding.)_

_If none: **No High-severity findings.**_

</details>

<details>
<summary>Medium Severity ([M] findings)</summary>

### M1 — [RULE-ID]: [Short title]

- **Field / path:** `...`
- **File:** `...`
- **Rule:** ...
- **Evidence:** ...
- **Fix:** ...

_(Repeat.)_

_If none: **No Medium-severity findings.**_

</details>

<details>
<summary>Low Severity ([L] findings)</summary>

### L1 — [RULE-ID]: [Short title]

- **Field / path:** `...`
- **File:** `...`
- **Suggestion:** ...

_(Repeat.)_

_If none: **No Low-severity findings.**_

</details>

---

### Notes for contributors

- Regenerate ECS artifacts after schema edits: **`make`** (or `make check` before push) per [CONTRIBUTING.md](../../../CONTRIBUTING.md).
- Naming and types: [ecs-guidelines](../../../docs/reference/ecs-guidelines.md), [ecs-conventions](../../../docs/reference/ecs-conventions.md), [schemas/README](../../../schemas/README.md).
- Automated review does **not** replace maintainer review or CI.

````
Loading
Loading