Initialise disk queue frame IDs from persisted state #50534
belimawr wants to merge 6 commits into elastic:main
💚 CLA has been signed
Fix a restart regression in disk queue where already-ACKed tail events could be replayed when multiple segment files existed.

On startup, `state.dat` restores `queuePosition.frameIndex`, but in-memory frame counters were reinitialized to zero. This desynchronized persisted progress from runtime frame ID tracking:
- read path used `segments.nextReadFrameID`
- ACK path used `acks.nextFrameID`
- persisted state tracked `queuePosition.frameIndex`

After restart, segment-boundary ACK bookkeeping could run with incorrect frame IDs, producing inconsistent persisted position and causing the last event from the newest segment to be replayed on a subsequent restart.

Initialize both runtime counters from persisted `frameIndex` during queue startup:
- set `segments.nextReadFrameID = frameID(nextReadPosition.frameIndex)`
- set `acks.nextFrameID = frameID(nextReadPosition.frameIndex)`

This keeps read/ACK frame ID progression aligned with persisted state across restarts and prevents duplicate replay of already-ACKed events.

Assisted-By: Codex 5.3
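To make the desynchronisation described above concrete, here is a minimal sketch in Go. It is not the diskqueue code: `queuePosition`, `counters`, `newCountersZero`, `newCountersFromState` and `inSync` are simplified, hypothetical stand-ins, and only the idea of seeding both counters with `frameID(frameIndex)` is taken from the description.

```go
package main

import "fmt"

// frameID is a simplified stand-in for the queue's frame counter type.
type frameID uint64

// queuePosition is a hypothetical, reduced model of the position
// restored from state.dat (only the fields needed for this sketch).
type queuePosition struct {
	segmentID  uint64
	frameIndex uint64
}

// counters models the two runtime counters named in the description:
// the read path's next frame ID and the ACK path's next frame ID.
type counters struct {
	nextReadFrameID frameID
	nextACKFrameID  frameID
}

// newCountersZero models the pre-fix behaviour: the persisted position
// is ignored and both counters start at zero.
func newCountersZero(_ queuePosition) counters {
	return counters{}
}

// newCountersFromState models the fix: both counters are seeded from
// the persisted frameIndex, so they agree with the restored state.
func newCountersFromState(pos queuePosition) counters {
	id := frameID(pos.frameIndex)
	return counters{nextReadFrameID: id, nextACKFrameID: id}
}

// inSync reports whether both runtime counters match the persisted index.
func inSync(c counters, pos queuePosition) bool {
	want := frameID(pos.frameIndex)
	return c.nextReadFrameID == want && c.nextACKFrameID == want
}

func main() {
	// Assumed example state: seven frames were already ACKed before restart.
	persisted := queuePosition{segmentID: 2, frameIndex: 7}

	fmt.Println("zero-initialised in sync:", inSync(newCountersZero(persisted), persisted))
	fmt.Println("state-initialised in sync:", inSync(newCountersFromState(persisted), persisted))
}
```

In this toy model the zero-initialised counters only agree with the persisted state when `frameIndex` is 0, which is consistent with the regression needing an earlier run that had already made progress.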
e8e12ee to cfd18ab
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes a diskqueue restart regression where already-ACKed tail events could be replayed when multiple segment files exist by aligning in-memory frame ID counters with persisted state.dat on startup.
Changes:
- Initialize `segments.nextReadFrameID` and `acks.nextFrameID` from persisted `queuePosition.frameIndex` during `NewQueue`.
- Add a regression test covering multi-run restart behavior and segment rollover conditions.
- Add a changelog fragment documenting the bug fix.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| libbeat/publisher/queue/diskqueue/queue.go | Initializes runtime frame ID counters from persisted state during startup. |
| libbeat/publisher/queue/diskqueue/core_loop.go | Removes now-incorrect comment about nextReadFrameID initialization behavior. |
| libbeat/publisher/queue/diskqueue/queue_test.go | Adds restart regression test and helpers creating multiple segments and asserting no replay. |
| changelog/fragments/1778107949-fix-disk-queue-initialisation.yaml | Adds changelog entry for the diskqueue initialization fix. |
TL;DR
Buildkite failed in
Remediation
Investigation details
Root Cause
In PR HEAD, the imports are currently ordered with local-module imports before testify:
Evidence
Local reproduction of formatter drift in this PR checkout produced:
@@ -25,6 +25,9 @@ import (
+ "github.com/stretchr/testify/assert"
+ "github.com/stretchr/testify/require"
@@ -32,8 +35,6 @@ import (
- "github.com/stretchr/testify/assert"
- "github.com/stretchr/testify/require"
Verification
Follow-up
If CI still fails after committing the updated
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
…eats into fix-disk-queue-initialisation
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)
Walkthrough
This PR fixes a bug where the disk queue replays the last ACKed event on startup when multiple segment files exist. The fix initializes frame IDs (
Proposed commit message
Checklist
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- … `stresstest.sh` script to run them under stress conditions and race detector to verify their stability.
- … `./changelog/fragments` using the changelog tool.

How to test this PR locally
Run the tests:
Or follow the instructions from the bug report below
Related issues