Skip to content

Expand distributed indexing, match numpy indexing scheme#938

Open
ClaudiaComito wants to merge 231 commits into
mainfrom
914_adv-indexing-outshape-outsplit
Open

Expand distributed indexing, match numpy indexing scheme#938
ClaudiaComito wants to merge 231 commits into
mainfrom
914_adv-indexing-outshape-outsplit

Conversation

@ClaudiaComito
Copy link
Copy Markdown
Member

@ClaudiaComito ClaudiaComito commented Mar 24, 2022

Description

This pull request introduces a significant overhaul of distributed indexing within dndarray.py, specifically targeting the __getitem__ and __setitem__ methods. The primary objective is to achieve full NumPy indexing compliance in a distributed environment while minimizing MPI overhead and memory footprint.

The logic has been refactored to identify zero-communication paths ("early out"), and route heavy unordered advanced indexing through (hopefully?) optimized communication.

The following table shows the distribution semantics of the DNDarray indexing operations. The first column shows the operation, the second column shows the distribution semantics of the key, and the third column shows the distribution semantics of the value. The last column shows the distribution semantics of the result.

Array distributed Operation Key distributed Value distributed Result distributed
no array[key] no -- no
no array[key] yes -- yes
yes array[key] no -- no
yes array[key] yes -- yes
no array[key] = value no no no
no array[key] = value no yes (but why?) no
no array[key] = value yes (but why?) no no
no array[key] = value yes yes no
yes array[key] = value no no yes
yes array[key] = value no yes yes
yes array[key] = value yes no yes
yes array[key] = value yes yes yes

Routing logic

The flowchart (DRAFT) maps out the MPI routing decisions based on the evaluated state of the indexing key.

graph TD
    classDef default fill:#ffffff,stroke:#ced4da,stroke-width:1px,color:#212529,rx:4px,ry:4px;
    classDef terminal fill:#343a40,stroke:#343a40,stroke-width:2px,color:#ffffff,rx:15px,ry:15px;
    classDef decision fill:#e3f2fd,stroke:#4dabf7,stroke-width:2px,color:#000000;
    classDef highlight fill:#e8f5e9,stroke:#69b3a2,stroke-width:2px,color:#212529;

    Start(["Start: arr[key] or arr[key] = value"]):::terminal
    Norm["Normalize Key (e.g., Bool Masks -> Int Indices)"]
    ProcessKey["__process_key() Expands dims, aligns shapes"]
    StateCalc{"Calculate State: split_key_is_ordered"}:::decision
    
    Start --> Norm
    Norm --> ProcessKey
    ProcessKey --> StateCalc
    
    Branch1{"Single item on split axis? (root != None)"}:::decision
    Branch0{"Operation?"}:::decision
    BranchNeg1{"Operation?"}:::decision
    
    StateCalc -- "1: Ordered / Ascending" --> Branch1
    StateCalc -- "0: Unordered / Random" --> Branch0
    StateCalc -- "-1: Descending Slice" --> BranchNeg1
    
    subgraph Ordered ["Fast Path: Ordered Indexing (split_key_is_ordered = 1)"]
        style Ordered fill:#f8f9fa,stroke:#dee2e6,stroke-width:1px,stroke-dasharray: 5 5,color:#495057
        Branch1 -- "Yes: Get" --> RootGet["Root fetches local data"]
        RootGet --> Bcast["MPI.Bcast to all ranks"]
        
        Branch1 -- "Yes: Set" --> RootSet["Root updates local data in-place"]
        
        Branch1 -- "No" --> FastLocal["Pure Basic Slicing: Apply locally, NO MPI needed"]:::highlight
    end
    
    subgraph Descending ["Descending Slices (split_key_is_ordered = -1)"]
        style Descending fill:#f8f9fa,stroke:#dee2e6,stroke-width:1px,stroke-dasharray: 5 5,color:#495057
        BranchNeg1 -- "Set" --> FlipVal["Flip 'value' array"]
        FlipVal --> MatchDist["Align distribution map"]
        MatchDist --> SetLocal["-1 Local Set"]
        
        BranchNeg1 -- "Get" --> UnorderedFallback["Converts to arange -> falls back to unordered"]
    end
    
    subgraph Unordered ["Heavy Path: Unordered Advanced Indexing (split_key_is_ordered = 0)"]
        style Unordered fill:#f8f9fa,stroke:#dee2e6,stroke-width:1px,stroke-dasharray: 5 5,color:#495057
        
        Branch0 -- "Get" --> G_Allgather["MPI.Allgather: Share recv_counts"]
        G_Allgather --> G_SendIdx["MPI.Isend/Recv: Send requested indices to owning ranks"]
        G_SendIdx --> G_Fetch["Owning ranks fetch local data"]
        G_Fetch --> G_SendData["MPI.Isend/Recv: Send requested data back"]
        G_SendData --> G_Reconstruct["Reconstruct recv_buf on original rank"]
        
        Branch0 -- "Set" --> S_CheckVal{"Is 'value' distributed?"}:::decision
        S_CheckVal -- "No / Scalar" --> S_LocalMask["_advanced_setitem_unordered_local (Apply locally)"]
        
        S_CheckVal -- "Yes" --> S_Align["Redistribute 'value' to match 'key' distribution"]
        S_Align --> S_AllToAll["MPI.Alltoallv: Exchange data AND indices"]
        S_AllToAll --> S_ApplyRecv["Apply received data to local elements"]
    end

    Bcast --> End(["Return / Complete"]):::terminal
    RootSet --> End
    FastLocal --> End
    SetLocal --> End
    UnorderedFallback -.-> Branch0
    G_Reconstruct --> End
    S_LocalMask --> End
    S_ApplyRecv --> End
Loading

Main changes

  • abstracts key parsing and alignment into a centralized private method that handles dimension expansion, shape broadcasting, and classifies the state of the indexing operation to determine network routing.
  • enforces standard last-assignment-wins semantics for advanced indexing duplicates on cuda tensors by generating linear indices and mapping local occurrence priorities (thanks @Hakdag97 ).
  • intercepts multidimensional and single-dimensional boolean masks early in the pipeline, converting them to explicit integer configurations locally to prevent unnecessary cross-rank broadcasting.
  • maps and isolates zero-communication assignments during slice operations, executing completely local pytorch tensor modifications when the requested indices and data already reside on the active rank.
  • structures unordered read requests by compiling global communication matrices, enabling the dispatch of non-blocking Isend and Recv calls strictly between nodes that own the requested indices and those requesting them.
  • forces distribution alignment during set operations if the right-hand side assignment value is also distributed, utilizing an Alltoallv operation to shuffle payload data and target indices concurrently.
  • introduces a value broadcasting helper function to natively squeeze or expand the dimensions of scalar or tensor payloads to match the specific dimensional footprint of the target slice before assignment occurs.

To Be Continued...

Memory footprint

Scaling behaviour

Issue/s resolved: #914 #918

Changes proposed:

  • feature extension in __process_key, getitem, and setitem methods
  • edge case handling
  • extensive comparison to numpy API in unittests

Type of change

Memory requirements

Performance

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

yes / no

skip ci

Mystic-Slice and others added 10 commits April 8, 2022 10:56
…ays (#937)

* Create ci.yaml

* Update ci.yaml

* Update ci.yaml

* Create CITATION.cff

* Update CITATION.cff

* Update ci.yaml

different python and pytorch versions

* Update ci.yaml

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Delete pre-commit.yml

* Update ci.yaml

* Update CITATION.cff

* Update tutorial.ipynb

delete example with different split axis

* Delete logo_heAT.pdf

Removal of old logo

* ht.nonzero() returns tuple of 1-D arrays instead of n-D arrays

* Updated documentation and Unit-tests

* replace x.larray with local_x

* Code fixes

* Fix return type of nonzero function and gout value

* Made sure DNDarray meta-data is available to the tuple members

* Transpose before if-branching + adjustments to accomodate it

* Fixed global shape assignment

* Updated changelog

Co-authored-by: mtar <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Coquelin <[email protected]>
Co-authored-by: Markus Goetz <[email protected]>
Co-authored-by: Claudia Comito <[email protected]>
@ClaudiaComito
Copy link
Copy Markdown
Member Author

  • pass key to process_key immediately, separate by view/copy output

@JuanPedroGHM
Copy link
Copy Markdown
Member

Create different functions for scalar:

  • Scalar
  • Mask
  • Slice
  • Advanced indexing
  • Distributed key

Process key should return the tuple with the key information, and send to the separate functions.

Additionally, unnecessary function definitions in __getitem/__setitem.

Make table and diagram available in a markdown file in the docs folder for now.

Copy link
Copy Markdown
Collaborator

@brownbaerchen brownbaerchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't looked at the actual advanced indexing stuff. But I think we would do well to clean up a bit before getting into the details of this. There have been some changes mixed in that, to me, seem unrelated to advanced indexing in heat. They should be moved to separate PRs if we want to keep them. Other changes are cosmetic or temporary and should be removed entirely.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this change from this PR?

Comment thread .talismanrc
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this change from this PR (see #2216 for a similar case)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about these changes, but some them can be removed from this PR, right?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this change from this PR.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove these changes from this PR.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes seem unrelated to advanced indexing. Maybe put in a separate PR.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed here?

Comment thread heat/cluster/kmedoids.py
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this from the PR

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to advanced indexing. Maybe move to a separate PR?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unrelated to advanced indexing. Maybe move to separate PR?

@github-project-automation github-project-automation Bot moved this from Merge queue to In Progress in Roadmap Apr 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core enhancement New feature or request indexing linalg testing Implementation of tests, or test-related issues

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Address distributed non-ordered indexing indexing vectors with index arrays

6 participants