Enable stencil ghost cell widening for sharded compilations #2732
Closed
gbaraldi wants to merge 7 commits into EnzymeAD:main from
Adds `stencil-ghost-cell-widening` to the compilation pipeline for sharded computations. Runs before optimization passes so stencil pads are eliminated before they get recognized as communication patterns (recognize_extend/rotate/wrap) or merged into MultiPadOp. Requires EnzymeAD/Enzyme-JAX#2326. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Runs as a function pass inside func.func(...) pipeline, before the transform-dialect patterns (which include multi-pad recognition). Only enabled when is_sharded=true. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…ring The stencil slice→pad patterns are cleaner after canonicalize/CSE and the pad optimization patterns have run. Moving it after the transform patterns but before lower_comms gives better detection. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The pass is a func::FuncOp pass, so it needs func.func() wrapper when used in a module-level pipeline. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Only keep the func.func()-wrapped version in func_passes. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The transform_passes include recognize_extend which converts slice→pad into enzymexla.extend before our pass could see them. Move the ghost cell widening to run after canonicalize/CSE but before the transform dialect patterns. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Pad optimization patterns (slice_pad etc.) fold our widened slices back into stencil pads if they run after us. By running last in func_passes (after all transform patterns and comm lowering), nothing undoes our work. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
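The `func.func(...)` nesting and the final placement described in the commits above can be sketched as a pipeline string. This is only an illustration, not Reactant's actual pipeline construction: `build_pipeline` is a hypothetical helper, and `canonicalize` and `cse` stand in for the real pass list. Only the `stencil-ghost-cell-widening` pass name and the `is_sharded` gating come from this PR.

```python
def build_pipeline(func_passes, is_sharded):
    """Assemble a module-level MLIR pass pipeline string (hypothetical helper)."""
    passes = list(func_passes)
    if is_sharded:
        # Appended last so that earlier pad patterns cannot fold the
        # widened slices back into stencil pads.
        passes.append("stencil-ghost-cell-widening")
    # A function pass must be nested under func.func(...) when it is
    # scheduled from a module-level pipeline.
    return "builtin.module(func.func(" + ",".join(passes) + "))"


if __name__ == "__main__":
    print(build_pipeline(["canonicalize", "cse"], is_sharded=True))
    print(build_pipeline(["canonicalize", "cse"], is_sharded=False))
```

Appending the pass at the end mirrors the last commit's reasoning: anything scheduled after it could undo the widening.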
Summary
Enables the `stencil-ghost-cell-widening` pass in the compilation pipeline for sharded computations. The pass replaces per-operator halo exchanges with a single wide ghost cell exchange, following the overlapped tiling with redundant computation approach.
Pipeline placement
Runs before the optimization passes (which include `recognize_extend`, `recognize_rotate`, `recognize_wrap`, and pad merging patterns). This ensures stencil pads are eliminated before they get recognized as communication patterns or merged into `MultiPadOp`.
Only runs when `is_sharded=true` and `optimization_passes=:all`.
Dependencies
Requires EnzymeAD/Enzyme-JAX#2326 (the pass implementation).
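For readers unfamiliar with the idea from the Summary, the following toy sketch contrasts per-operator halo exchange with a single widened exchange followed by redundant computation. It is not the pass implementation from EnzymeAD/Enzyme-JAX#2326. It assumes a 1D periodic domain and a 3-point averaging stencil, and all function names are hypothetical. The `exchanges` counter simply counts per-shard halo fetches.

```python
def stencil(u):
    """3-point average over interior points; output is 2 cells shorter."""
    return [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]


def halo(global_u, start, stop, width):
    """Shard [start, stop) padded with `width` periodic ghost cells per side."""
    n = len(global_u)
    return [global_u[i % n] for i in range(start - width, stop + width)]


def reference(u, steps):
    """Unsharded result, used to check both sharded variants."""
    for _ in range(steps):
        u = stencil(halo(u, 0, len(u), 1))
    return u


def per_step_exchange(u, bounds, steps):
    """One width-1 halo exchange per shard for every stencil application."""
    exchanges = 0
    for _ in range(steps):
        shards = []
        for start, stop in bounds:
            shards.append(stencil(halo(u, start, stop, 1)))
            exchanges += 1
        u = [x for shard in shards for x in shard]
    return u, exchanges


def widened_exchange(u, bounds, steps):
    """One width-`steps` halo exchange per shard, then purely local work."""
    exchanges = 0
    shards = []
    for start, stop in bounds:
        local = halo(u, start, stop, steps)
        exchanges += 1
        for _ in range(steps):
            # Each application consumes one ghost cell per side; the cells
            # near the shard edge are recomputed redundantly by neighbors.
            local = stencil(local)
        shards.append(local)
    return [x for shard in shards for x in shard], exchanges


if __name__ == "__main__":
    u0 = [float(i * i % 7) for i in range(8)]
    bounds = [(0, 4), (4, 8)]  # two shards of four cells each
    print(per_step_exchange(u0, bounds, 2))
    print(widened_exchange(u0, bounds, 2))
```

Both variants produce the same values as the unsharded computation. The widened one trades a few redundantly computed edge cells for fewer communication rounds.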
Test plan
🤖 Generated with Claude Code