feat: fetch multiple chunks in parallel when reading from s3 #617
Conversation
overrides Zarr's default sequential getitems() for significantly higher throughput when reading from an S3 bucket
HCookie left a comment
Thanks for your interest in the project.
Looks good to me, just one comment on the changelog, as we don't manually set that.
No idea why I cannot merge this; I don't see any review from HCookie. Let's dismiss the review to try merging.
@ronandarcy I updated your branch, approved, and merged this. Some contributors may not like it when somebody else updates their branch, because it's a commit in their own repo. If this is a problem for you, don't hesitate to say so.
No problem at all, thanks for merging
Description
A companion change to ecmwf/anemoi-utils#289 to make use of the get_objects_parallel function.
What problem does this change solve?
Significantly increases training speed when reading from an S3 bucket.
When training using the O48 dataset hosted at EWC, a speedup from 0.31 it/s to 0.79 it/s was observed, an increase of over 150%.
What issue or task does this change relate to?
Additional notes
ecmwf/anemoi-utils#289 is required for this change to work.
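The general idea described above can be sketched as follows. This is a minimal, hypothetical illustration, not the PR's actual code: it assumes a zarr-style store whose default getitems() fetches keys one at a time, and replaces that loop with a thread pool so S3 network latency overlaps across chunks. The real change delegates to get_objects_parallel from anemoi-utils (not shown here); fetch_one below is a stand-in for a single sequential S3 GET.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_one(key: str) -> bytes:
    # Stand-in for one sequential S3 GET, e.g.
    # s3.get_object(Bucket=..., Key=key)["Body"].read()
    return f"data-for-{key}".encode()


def getitems_parallel(keys, max_workers=8):
    """Fetch many chunk keys concurrently instead of one at a time.

    Mirrors the idea of overriding a store's sequential getitems():
    each key maps to one S3 object, and the thread pool lets the
    per-object network round-trips overlap.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so keys and results line up.
        results = pool.map(fetch_one, keys)
    return dict(zip(keys, results))
```

Because chunk reads are I/O-bound, threads (rather than processes) are enough to hide latency, which is consistent with the roughly 2.5x throughput gain reported above.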
As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/
By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.