ct/reconciler: parallelize LRO updates #30090
Conversation
Each LRO update is an independent raft replication against a different partition's ctp_stm, but the updates were issued sequentially. In a potato-topic saturation workload, waiting on the sequential LRO updates occupied up to 50% of each reconciliation cycle. As a mitigation, advance LROs with bounded parallelism. In the same test, this increased reconciliation throughput by 30%.
If we backport potato topics, this is 100% worth backporting. Are we going to backport a new feature? Probably not. Still worth backporting? It's a small improvement, so why not.
This Claude summary of the benchrunner comparison I did is also nice:
Pull request overview
This PR improves cloud-topics reconciliation throughput by parallelizing LRO (last reconciled offset) updates after successfully committing newly built L1 objects to the metastore. This reduces time spent waiting on sequential per-partition raft-replicated STM updates.
Changes:
- Collect all per-source `commit_info` entries across committed objects into a single list.
- Update LROs using `ss::max_concurrent_for_each` with a fixed concurrency cap (32) instead of issuing updates sequentially.
Nice!
Is this to say that the LRO update was the long pole even with metastore latencies in seconds? Or did your reporting in Slack about the metastore latency include this fix already?
This was the worst thing. Even with the 30% speedup, RC still can't keep pace with ingest, so we still need to address the metastore stuff.
/backport v26.1.x