perf(arrow-ord): Avoid full index materialization for small-limit lexsorts#9991
pchintar wants to merge 2 commits into
@alamb & @etseidl I investigated this integration failure locally since the failure appeared unrelated to the changes in my PR. My PR only modifies
I also checked the local integration corpus and could not find a
I also noticed the same

run benchmarks sort_kernel

I think it was an error in .NET:

🤖 Arrow criterion benchmark running (GKE): comparing lexsort-small-limit-topk (b7bdb86) to 4b80f0e (merge-base).

🤖 Arrow criterion benchmark completed (GKE).

Which issue does this PR close?
Rationale for this change
In arrow-ord/src/sort.rs, lexsort_to_indices currently materializes row indices for the full input before applying the requested lexsort limit.
For example, with 4096 input rows and a limit of 10, the current implementation still allocates and initializes indices for all 4096 rows, even though only the first 10 sorted indices are returned.
This PR reduces allocation and sorting work for small-limit lexsorts by avoiding full index materialization when the requested limit is a small fraction of the input size (limit below one tenth of the row count).
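To make the cost being described concrete, here is a minimal sketch in plain Rust with no Arrow types. It is not the code in arrow-ord/src/sort.rs. The names is_small_limit and full_materialize_top, the (i32, i32) key representation, and the exact form of the threshold check are assumptions made for illustration.

```rust
/// Hypothetical predicate mirroring the "limit below one tenth of the
/// row count" rule described above (exact form is an assumption).
fn is_small_limit(limit: usize, row_count: usize) -> bool {
    limit.saturating_mul(10) < row_count
}

/// Baseline sketch: build an index for every row, then keep only the
/// `limit` smallest. The allocation is O(n) even when `limit` is tiny.
fn full_materialize_top(keys: &[(i32, i32)], limit: usize) -> Vec<usize> {
    // One index per input row, regardless of how many are returned.
    let mut indices: Vec<usize> = (0..keys.len()).collect();
    // Lexicographic comparison on the key tuple, ties broken by row index.
    let cmp = |a: &usize, b: &usize| keys[*a].cmp(&keys[*b]).then(a.cmp(b));
    let limit = limit.min(indices.len());
    if limit > 0 && limit < indices.len() {
        // Partition so the `limit` smallest rows occupy the front.
        indices.select_nth_unstable_by(limit - 1, cmp);
    }
    indices.truncate(limit);
    indices.sort_unstable_by(cmp);
    indices
}
```

With 4096 rows and a limit of 10, this sketch fills a 4096-element index vector and then discards all but 10 entries, which is the overhead the PR aims to avoid.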
What changes are included in this PR?
This PR adds a bounded top-k heap path for small-limit lexsorts in arrow-ord/src/sort.rs.
The new path:
The existing partial-sort implementation remains unchanged for larger limits.
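As a rough illustration of the bounded top-k idea, the sketch below keeps at most limit rows in a max-heap whose root is the worst row retained so far. It uses std::collections::BinaryHeap rather than hand-written sift helpers, works on two plain i32 columns in ascending order only, and ignores nulls and sort options. The function name topk_indices and these simplifications are assumptions, not the PR's implementation.

```rust
use std::collections::BinaryHeap;

/// Hypothetical sketch: indices of the `limit` smallest rows in sorted order.
/// Rows compare lexicographically by (col_a, col_b), ties broken by row index.
fn topk_indices(col_a: &[i32], col_b: &[i32], limit: usize) -> Vec<usize> {
    assert_eq!(col_a.len(), col_b.len());
    let limit = limit.min(col_a.len());
    if limit == 0 {
        return Vec::new();
    }
    // Max-heap: the root is the worst (largest) row currently kept,
    // so memory stays O(limit) instead of O(row count).
    let mut heap: BinaryHeap<(i32, i32, usize)> = BinaryHeap::with_capacity(limit);
    for i in 0..col_a.len() {
        let row = (col_a[i], col_b[i], i);
        if heap.len() < limit {
            heap.push(row);
        } else if let Some(mut worst) = heap.peek_mut() {
            if row < *worst {
                // Replace the worst kept row; the heap restores its
                // ordering when `worst` goes out of scope.
                *worst = row;
            }
        }
    }
    // Ascending order, then keep only the row indices.
    heap.into_sorted_vec().into_iter().map(|(_, _, i)| i).collect()
}
```

Each input row costs at most one comparison against the heap root plus an O(log limit) sift when it displaces that root, so the work scales with the limit rather than with a full sort of the input.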
To support the bounded heap implementation, this PR also adds:
lexsort_topk_fixed
lexsort_topk
sift_up_worst_heap
sift_down_worst_heap
Are these changes tested?
Yes.
Existing tests pass:
My local benchmark results from:
show improvements for small-limit lexsorts, such as limits of 10 and 100.
Are there any user-facing changes?
No.