Commit 37f46f8

carsonip and claude committed
ci: tighten drift workflow for readability and maintainability
- Single source of truth for the apm-server config: write it to a file in the CI run, then read it back (with path.data rewritten) when generating the recipe in the drift issue. The CI run and the recipe can no longer disagree about what config produced the captured /stats.
- Split "Run apm-server, capture /stats" into "Write apm-server config" + "Run apm-server, capture /stats". The split makes the config the source-of-truth artifact other steps reference.
- Replace the on-disk PID file with a bash variable plus an EXIT trap that cleans up apm-server even if curl fails.
- Hoist STATS_PORT to a job-level env so the magic 15066 lives in one place.
- Recipe: `export PATH="$PATH:$(go env GOPATH)/bin"` after `go install`, since contributors without that already in PATH would otherwise see "stats-to-mapping: command not found".
- Issue title and lead paragraph now state the drifted file count (`stats-to-mapping drift in N file(s) (date)`) so a reader scanning the issue list grasps the scope without opening it. Workflow output renamed `drift` -> `drifted_files` (a count) and the conditional steps gate on != '0'.

No behavior change in what's detected or when an issue is opened.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
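The PID-variable-plus-EXIT-trap cleanup described in the commit message can be sketched in isolation. This is a minimal illustration, not the workflow's code: `demo_cleanup` is a hypothetical helper, `sleep 300` stands in for apm-server, and `exit 1` stands in for a failing curl probe.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the workflow): starts a background process,
# registers an EXIT trap, then fails on purpose to show cleanup still runs.
# The function body is a subshell, so the trap fires when that subshell exits.
demo_cleanup() (
  pidfile=$1
  sleep 300 &                 # stand-in for the long-running server
  pid=$!
  echo "$pid" > "$pidfile"    # recorded only so the caller can inspect it
  # Runs on any exit from this subshell, whether success or failure.
  trap 'kill "$pid" 2>/dev/null; wait "$pid" 2>/dev/null || true' EXIT
  exit 1                      # stand-in for a failing readiness probe
)

pidfile=$(mktemp)
demo_cleanup "$pidfile" || echo "probe failed with status $?"
if kill -0 "$(cat "$pidfile")" 2>/dev/null; then
  echo "background process leaked"
else
  echo "background process cleaned up"
fi
rm -f "$pidfile"
```

Because the trap is registered right after the process starts, every later failure path goes through the same cleanup, which is what lets the PID file and the trailing `kill`/`wait` pair be dropped.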
1 parent 70ab369 commit 37f46f8

1 file changed

Lines changed: 63 additions & 45 deletions

.github/workflows/stats-to-mapping-drift.yml

@@ -29,11 +29,12 @@ jobs:
 env:
 DRIFT_LABEL: stats-to-mapping-drift
 WORK: ${{ github.workspace }}/.drift
+STATS_PORT: 15066
 # Single source of truth for the files this job tracks. Each line is
 # "<repo>/<path-within-repo>". The first slash-separated segment is
 # treated as the github.com/elastic/<repo> name; the remainder is the
 # file's path inside that repo. Sparse-checkout, the regen invocation,
-# the drift loop, and the issue-body recipe are all derived from this.
+# the drift loop, and the issue-body recipe all derive from this.
 UPSTREAM_FILES: |
 elasticsearch/x-pack/plugin/core/template-resources/src/main/resources/monitoring-beats.json
 elasticsearch/x-pack/plugin/core/template-resources/src/main/resources/monitoring-beats-mb.json
@@ -55,11 +56,14 @@ jobs:
 - name: Install stats-to-mapping
 run: go install github.com/elastic/apm-tools/cmd/stats-to-mapping@latest

-- name: Run apm-server, capture /stats
+# Write the apm-server config to a file. The same file is read back later
+# to embed in the drift issue's reproduction recipe (with path.data
+# rewritten), so the two can never drift.
+- name: Write apm-server config
 run: |
 set -euo pipefail
 mkdir -p "$WORK/data"
-install -m 0600 /dev/stdin "$WORK/apm-server.yml" <<'EOF'
+install -m 0600 /dev/stdin "$WORK/apm-server.yml" <<EOF
 apm-server:
 host: "127.0.0.1:18200"
 sampling.tail:
@@ -69,27 +73,30 @@ jobs:
 - sample_rate: 1
 output.elasticsearch:
 hosts: ["http://127.0.0.1:9200"]
-path.data: ${WORK}/data
+path.data: $WORK/data
 http.enabled: true
 http.host: "127.0.0.1"
-http.port: 15066
+http.port: $STATS_PORT
 logging.level: warning
 EOF
-# Substitute $WORK because the heredoc was quoted.
-sed -i "s|\${WORK}|$WORK|" "$WORK/apm-server.yml"
-"$WORK/apm-server" --strict.perms=false -e -c "$WORK/apm-server.yml" >"$WORK/apm-server.log" 2>&1 &
-echo $! > "$WORK/apm-server.pid"
+
+- name: Run apm-server, capture /stats
+run: |
+set -euo pipefail
+"$WORK/apm-server" --strict.perms=false -e -c "$WORK/apm-server.yml" \
+>"$WORK/apm-server.log" 2>&1 &
+pid=$!
+trap 'kill "$pid" 2>/dev/null; wait "$pid" 2>/dev/null || true' EXIT
 for attempt in $(seq 1 30); do
-if curl -sf http://127.0.0.1:15066/stats -o "$WORK/stats.json"; then
+if curl -sf "http://127.0.0.1:$STATS_PORT/stats" -o "$WORK/stats.json"; then
 echo "Captured /stats on attempt $attempt"
 break
 fi
 sleep 1
 done
-kill "$(cat "$WORK/apm-server.pid")" || true
-wait || true
 test -s "$WORK/stats.json"
-# Sanity: TBS-enabled stats should contain sampling.tail.storage.
+# TBS-enabled stats must include sampling.tail.storage.* — fail fast
+# if the config or apm-server's stats shape changed.
 jq -e '.["apm-server"].sampling.tail.storage' "$WORK/stats.json" >/dev/null

 - name: Sparse-checkout upstream mapping files
@@ -124,7 +131,7 @@ jobs:
 set -euo pipefail
 : > "$WORK/drift.diff"
 : > "$WORK/drift.summary.md"
-drift_found=false
+drifted_files=0
 while IFS= read -r path; do
 [ -z "$path" ] && continue
 repo=${path%%/*}
@@ -133,7 +140,7 @@ jobs:
 if git -C "$tree" diff --quiet -- "$file"; then
 continue
 fi
-drift_found=true
+drifted_files=$((drifted_files + 1))
 read -r added removed _ < <(git -C "$tree" diff --numstat -- "$file")
 printf -- '- `%s/%s` — +%s / -%s\n' "$repo" "$file" "$added" "$removed" \
 >> "$WORK/drift.summary.md"
@@ -143,10 +150,10 @@ jobs:
 printf '\n```\n'
 } >> "$WORK/drift.diff"
 done <<< "$UPSTREAM_FILES"
-echo "drift=$drift_found" >> "$GITHUB_OUTPUT"
+echo "drifted_files=$drifted_files" >> "$GITHUB_OUTPUT"

 - name: Upload drift diff artifact
-if: steps.drift.outputs.drift == 'true'
+if: steps.drift.outputs.drifted_files != '0'
 uses: actions/upload-artifact@v4
 with:
 name: drift-diff
@@ -155,26 +162,45 @@ jobs:
 ${{ env.WORK }}/stats.json
 retention-days: 30

+# If drift is detected and no open drift issue already exists, file a new
+# issue. The issue body has three sections: a one-line summary, a list of
+# drifted files, and a copy-pasteable local reproduction recipe.
+#
+# The apm-server config in the recipe is read back from the file written
+# earlier in this job (with the CI-specific path.data rewritten) so the
+# recipe and the CI run can never disagree about what config produced
+# the captured /stats.
 - name: Open drift issue if none exists
-if: steps.drift.outputs.drift == 'true'
+if: steps.drift.outputs.drifted_files != '0'
 env:
 GH_TOKEN: ${{ github.token }}
 REPO: ${{ github.repository }}
 RUN_URL: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}
+DRIFTED_FILES: ${{ steps.drift.outputs.drifted_files }}
 run: |
 set -euo pipefail
-existing=$(gh issue list --repo "$REPO" --label "$DRIFT_LABEL" --state open --json number --jq 'length')
+existing=$(gh issue list --repo "$REPO" --label "$DRIFT_LABEL" \
+--state open --json number --jq 'length')
 if [ "$existing" -gt 0 ]; then
 echo "Open drift issue already exists; not opening another"
 exit 0
 fi
+
 today=$(date -u +%Y-%m-%d)
+
+# Build the path block for the recipe by prefixing each tracked file
+# with /tmp/. Trailing backslashes continue the shell command.
 recipe_paths=$(printf '%s\n' "$UPSTREAM_FILES" \
-| awk 'NF > 0 { print " /tmp/" $0 " \\" }')
+| awk 'NF > 0 { print " /tmp/" $0 " \\" }')
+
+# Reuse the exact apm-server config from the CI run, with path.data
+# rewritten to the local /tmp location used in the recipe.
+recipe_config=$(sed "s|$WORK/data|/tmp/apm-server-data|" "$WORK/apm-server.yml")
+
 BODY=$WORK/issue-body.md
 {
 cat <<HEADER
-The weekly drift-check workflow detected divergence between apm-server's \`/stats\` endpoint and the upstream mapping files maintained in \`elastic/elasticsearch\`, \`elastic/beats\`, and \`elastic/integrations\`.
+The weekly drift-check workflow detected drift in **$DRIFTED_FILES file(s)** between apm-server's \`/stats\` endpoint and the upstream mapping files in \`elastic/elasticsearch\`, \`elastic/beats\`, and \`elastic/integrations\`.

 Detected on $today by [workflow run]($RUN_URL). The full \`git diff\` of every drifted file is attached to that run as the \`drift-diff\` artifact, alongside the captured \`stats.json\`.

@@ -186,56 +212,48 @@ jobs:

 ## Reproduce locally

+From an apm-server checkout:
+
 \`\`\`shell
-# 1. Install the regen tool.
+# 1. Install the regen tool. Make sure \$(go env GOPATH)/bin is on PATH.
 go install github.com/elastic/apm-tools/cmd/stats-to-mapping@latest
+export PATH="\$PATH:\$(go env GOPATH)/bin"

-# 2. Build apm-server from main and capture a TBS-enabled /stats snapshot.
+# 2. Build apm-server and capture a TBS-enabled /stats snapshot.
 go build -o /tmp/apm-server ./x-pack/apm-server
 install -m 0600 /dev/stdin /tmp/apm-server.yml <<'CFG'
-apm-server:
-host: "127.0.0.1:18200"
-sampling.tail:
-enabled: true
-interval: 1m
-policies:
-- sample_rate: 1
-output.elasticsearch:
-hosts: ["http://127.0.0.1:9200"]
-path.data: /tmp/apm-server-data
-http.enabled: true
-http.host: "127.0.0.1"
-http.port: 15066
-logging.level: warning
+$recipe_config
 CFG
 /tmp/apm-server --strict.perms=false -e -c /tmp/apm-server.yml &
+pid=\$!
 sleep 6
-curl -s http://127.0.0.1:15066/stats > /tmp/stats.json
-kill %1; wait || true
+curl -s http://127.0.0.1:$STATS_PORT/stats > /tmp/stats.json
+kill "\$pid"; wait "\$pid" 2>/dev/null || true

-# 3. Clone the three upstream repos (sparse).
+# 3. Clone the three upstream repos.
 for r in elasticsearch beats integrations; do
 git clone --depth 1 --filter=blob:none "https://github.com/elastic/\$r" "/tmp/\$r"
 done

-# 4. Run the regen.
+# 4. Regenerate the five mapping files in place.
 stats-to-mapping \\
 $recipe_paths
 < /tmp/stats.json

-# 5. Inspect the diffs and prepare upstream PRs.
+# 5. Inspect the resulting diffs and prepare upstream PRs.
 for r in elasticsearch beats integrations; do
-echo "=== \$r ==="; git -C "/tmp/\$r" diff
+echo "=== \$r ==="
+git -C "/tmp/\$r" diff
 done
 \`\`\`

 ## Resolution

-Open a PR in each upstream repo with the diffs above, then close this issue. The workflow will not open another issue while this one stays open.
+Open a PR in each upstream repo with the diffs above, then close this issue. The workflow will not open another drift issue while this one stays open.
 RECIPE
 } > "$BODY"
 gh issue create --repo "$REPO" \
---title "Monitoring metric mappings drifted from /stats ($today)" \
+--title "stats-to-mapping drift in $DRIFTED_FILES file(s) ($today)" \
 --label bug \
 --label "$DRIFT_LABEL" \
 --body-file "$BODY"
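
The config round trip in this commit (write the file once with an unquoted heredoc, then read it back with `path.data` rewritten) can be tried outside CI. This is a minimal sketch under assumptions: the temporary `WORK` directory, the two-line `app.yml`, and its file name are hypothetical stand-ins for the workflow's real values.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical stand-ins for the job-level env values.
WORK=$(mktemp -d)
STATS_PORT=15066

# Unquoted heredoc delimiter: $WORK and $STATS_PORT expand at write time,
# so no follow-up sed pass is needed to substitute them.
cat > "$WORK/app.yml" <<EOF
path.data: $WORK/data
http.port: $STATS_PORT
EOF

# Read the same file back, rewriting only the CI-specific data path.
recipe_config=$(sed "s|$WORK/data|/tmp/apm-server-data|" "$WORK/app.yml")
printf '%s\n' "$recipe_config"
```

One caveat with this approach: `$WORK` is interpolated into the sed pattern as a regular expression, so a workspace path containing `|` or regex metacharacters would need escaping before the substitution is reliable.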
