feat: fetch multiple chunks in parallel when reading from s3#617

Merged
floriankrb merged 4 commits into
ecmwf:mainfrom
ronandarcy:feat/s3-optimisation
May 6, 2026

@ronandarcy
Contributor

Overrides Zarr's default sequential getitems() to achieve significantly higher throughput when reading from an S3 bucket.

Description

A companion change to ecmwf/anemoi-utils#289, making use of get_objects_parallel.

What problem does this change solve?

Significantly increases training speed when reading from an S3 bucket.
When training with the O48 dataset hosted at EWC, throughput rose from 0.31 it/s to 0.79 it/s, an increase of over 150% (roughly 2.5x).

What issue or task does this change relate to?

Additional notes

ecmwf/anemoi-utils#289 is required for this change to work.

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.

@github-project-automation github-project-automation Bot moved this to To be triaged in Anemoi-dev Apr 23, 2026
@github-actions github-actions Bot added the contributor and documentation (Improvements or additions to documentation) labels Apr 23, 2026
Member

@HCookie HCookie left a comment

Thanks for your interest in the project.
Looks good to me, just one comment on the changelog, as we don't manually set that.

Comment thread CHANGELOG.md
@github-project-automation github-project-automation Bot moved this from To be triaged to Under Review in Anemoi-dev Apr 23, 2026
@HCookie HCookie requested review from b8raoult and floriankrb May 5, 2026 17:42
@floriankrb floriankrb requested review from HCookie and removed request for HCookie and b8raoult May 6, 2026 07:08
@floriankrb floriankrb dismissed HCookie’s stale review May 6, 2026 07:18

No idea why I cannot merge this. I don't see any review from HCookie, so let's dismiss the review to try merging.

@floriankrb
Member

@ronandarcy I updated your branch, approved, and merged this. Some contributors may not like when somebody else is updating their branch because it's a commit in their own repo. If this is a problem for you, don't hesitate to say so.

@floriankrb floriankrb merged commit 88e8c08 into ecmwf:main May 6, 2026
12 checks passed
@github-project-automation github-project-automation Bot moved this from Under Review to Done in Anemoi-dev May 6, 2026
@ronandarcy
Contributor Author

> @ronandarcy I updated your branch, approved, and merged this. Some contributors may not like when somebody else is updating their branch because it's a commit in their own repo. If this is a problem for you, don't hesitate to say so.

No problem at all, thanks for merging

Labels

ATS Approval not needed, contributor, documentation (Improvements or additions to documentation)

Projects

Status: Done

3 participants