feat(optimizer): support index selection for selective backfill #25207
Open
chenzl25 wants to merge 3 commits into dylan/support_pk_prefix_for_snapshot_backfill
Warning This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
This stack of pull requests is managed by Graphite. Learn more about stacking.
Pull request overview
Adds a streaming-only optimizer rewrite to improve snapshot backfill performance and routing correctness by (1) selecting a lowest-cost covering index for backfill scans and (2) expanding eligible IN (...) predicates into a union of per-value scans.
Changes:
- Introduces `StreamingIndexSelectionRule` (covering-index choice + `IN` → `LogicalUnion` of scans).
- Threads snapshot backfill type into `logical_rewrite_for_stream` via `RewriteStreamContext`.
- Extends planner tests and e2e backfill tests to validate union-shaped plans and covering-index usage.
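The covering-index choice listed above can be pictured with a small, self-contained sketch. It is not the code from this PR: `ScanCandidate`, `pick_covering_candidate`, and the plain integer `estimated_cost` are hypothetical stand-ins, and the real rule reuses the cost-estimation helpers from `index_selection_rule.rs` rather than a single number. The sketch only assumes that a candidate "covers" a scan when it stores every column the scan needs.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified stand-in for a primary table or an index
/// that a backfill scan could read from.
#[derive(Debug, Clone, PartialEq)]
pub struct ScanCandidate {
    pub name: String,
    /// Every column stored in this candidate (key and included columns).
    pub columns: BTreeSet<String>,
    /// Assumed cost estimate; lower is better.
    pub estimated_cost: u64,
}

/// Small constructor helper so examples stay short.
pub fn candidate(name: &str, columns: &[&str], estimated_cost: u64) -> ScanCandidate {
    ScanCandidate {
        name: name.to_string(),
        columns: columns.iter().map(|c| c.to_string()).collect(),
        estimated_cost,
    }
}

/// Returns the cheapest candidate that stores every required column,
/// or `None` when no candidate covers the scan.
/// On a cost tie, the earliest candidate in the slice wins.
pub fn pick_covering_candidate<'a>(
    candidates: &'a [ScanCandidate],
    required: &BTreeSet<String>,
) -> Option<&'a ScanCandidate> {
    candidates
        .iter()
        .filter(|c| required.is_subset(&c.columns))
        .min_by_key(|c| c.estimated_cost)
}
```

In this toy model the primary table would normally be included among the candidates, so that a scan needing columns no index stores still has a covering option.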
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/frontend/src/optimizer/rule/streaming_index_selection_rule.rs | New streaming backfill rewrite rule (covering index selection + IN expansion). |
| src/frontend/src/optimizer/rule/mod.rs | Wires the new rule into the optimizer rule module exports. |
| src/frontend/src/optimizer/rule/index_selection_rule.rs | Exposes cost-estimation helpers/types for reuse by the streaming rule. |
| src/frontend/src/optimizer/plan_node/logical_scan.rs | Applies streaming rewrite during logical_rewrite_for_stream for snapshot backfill; exposes clone_with_predicate. |
| src/frontend/src/optimizer/plan_node/convert.rs | Extends RewriteStreamContext with optional BackfillType. |
| src/frontend/src/optimizer/mod.rs | Passes backfill type into logical_rewrite_for_stream. |
| src/frontend/planner_test/tests/testdata/input/backfill.yaml | Adds planner test inputs for IN expansion and covering index backfill selection. |
| src/frontend/planner_test/tests/testdata/output/backfill.yaml | Adds expected stream plans (including StreamUnion) for new cases. |
| e2e_test/backfill/snapshot_backfill/pk_predicate_pushdown.slt | Adds e2e coverage for IN predicate behavior and covering index usage during snapshot backfill. |

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR implements streaming index selection optimizations for snapshot backfill operations. The changes introduce two key optimizations:
1. Covering Index Selection for Backfill
- Adds `StreamingIndexSelectionRule`, which selects the lowest-cost covering index during snapshot backfill.

2. IN Predicate Expansion
- Expands `IN` predicates (e.g., `WHERE a IN (1, 2, 3)`) into a `LogicalUnion` of separate `LogicalScan` nodes.
- Each scan carries a single equality predicate (`a = 1`, `a = 2`, `a = 3`) for better scan range optimization.

Implementation Details:
- Applies the rewrite in `logical_rewrite_for_stream` for snapshot backfill operations.
- Extends `RewriteStreamContext` to pass backfill type information through the rewrite process.
- Adds planner tests whose expected plans include `StreamUnion` structures.

The changes ensure that materialized view backfill operations can leverage indexes more effectively and handle IN predicates with optimal scan patterns.
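The IN expansion described above can be illustrated with a toy plan model. This is a sketch under stated assumptions, not the rule from this PR: `Plan`, `Predicate`, `scan_eq`, and `expand_in` are hypothetical names, every `IN` list is treated as eligible, and values are limited to `i64`. Sorting and deduplicating the list is also an assumption of the sketch, made so that the union branches stay disjoint and no row is produced twice.

```rust
/// Hypothetical, simplified predicate on a single column.
#[derive(Debug, Clone, PartialEq)]
pub enum Predicate {
    Eq { column: String, value: i64 },
    In { column: String, values: Vec<i64> },
}

/// Hypothetical, simplified logical plan.
#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    Scan { table: String, predicate: Predicate },
    Union(Vec<Plan>),
}

/// Builds a scan with one equality predicate.
pub fn scan_eq(table: &str, column: &str, value: i64) -> Plan {
    Plan::Scan {
        table: table.to_string(),
        predicate: Predicate::Eq { column: column.to_string(), value },
    }
}

/// Rewrites `Scan(col IN (v1, v2, ...))` into a union of `Scan(col = vi)`.
/// Plans of any other shape are returned unchanged.
pub fn expand_in(plan: Plan) -> Plan {
    match plan {
        Plan::Scan { table, predicate: Predicate::In { column, mut values } } => {
            // Assumption of this sketch: drop duplicates so branches are disjoint.
            values.sort_unstable();
            values.dedup();
            let mut scans: Vec<Plan> = values
                .into_iter()
                .map(|value| scan_eq(&table, &column, value))
                .collect();
            match scans.len() {
                // A single value needs no union.
                1 => scans.pop().unwrap(),
                // An empty list would yield an empty union here.
                _ => Plan::Union(scans),
            }
        }
        other => other,
    }
}
```

Each branch ends up with a point predicate on the column, which is what gives every scan a narrow scan range instead of one scan that filters the whole input.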
Checklist
Documentation
Release note
Improved performance of materialized view backfill operations through automatic index selection and IN predicate optimization. When creating materialized views with filtered queries, RisingWave now automatically selects the most efficient covering index and optimizes IN predicates by splitting them into parallel scan operations. This results in faster backfill completion times, especially for queries with selective predicates on indexed columns.