Add maxBytesPerRequest to NimbleIndexProjector #644

Open
tanjialiang wants to merge 2 commits into facebookincubator:main from tanjialiang:export-D99391142
@tanjialiang
Contributor

Summary:
Add a maxBytesPerRequest option to NimbleIndexProjector that limits
the total bytes of serialized chunk data per lookup, complementing the
existing maxRowsPerRequest row-based limit.

Since byte sizes are only known after stripe serialization, byte-based
truncation operates at stripe granularity: at least one stripe is always
included (guaranteeing forward progress), then subsequent stripes are
skipped if the byte budget is exceeded. A resume key is set to enable
pagination.
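
The stripe-granularity rule above can be sketched as follows. This is a minimal illustration, not the code in this diff: `StripeChunk`, `TruncatedLookup`, and `applyByteBudget` are hypothetical names, and the sketch assumes the budget counts as reached when the accumulated bytes are greater than or equal to `maxBytesPerRequest`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one serialized stripe of a single lookup.
struct StripeChunk {
  uint64_t serializedBytes;        // known only after the stripe is serialized
  std::string stripeEndResumeKey;  // assumed: key at which to resume after this stripe
};

// Hypothetical result summary for one lookup.
struct TruncatedLookup {
  size_t stripesIncluded{0};
  uint64_t bytes{0};
  std::optional<std::string> resumeKey;  // set only when stripes were skipped
};

// Applies a byte budget at stripe granularity. A budget of 0 means no limit.
inline TruncatedLookup applyByteBudget(
    const std::vector<StripeChunk>& stripes,
    uint64_t maxBytesPerRequest) {
  TruncatedLookup out;
  for (size_t i = 0; i < stripes.size(); ++i) {
    // The budget is checked only after a stripe has been added, so the
    // first stripe is always included and the lookup makes progress.
    out.bytes += stripes[i].serializedBytes;
    ++out.stripesIncluded;

    const bool saturated =
        maxBytesPerRequest != 0 && out.bytes >= maxBytesPerRequest;
    if (saturated && i + 1 < stripes.size()) {
      // Remaining stripes are skipped; record where the caller resumes.
      out.resumeKey = stripes[i].stripeEndResumeKey;
      break;
    }
  }
  return out;
}
```

Because the check happens after a stripe is added, the returned byte count can overshoot the budget by up to one stripe. The caller would pass `resumeKey` back as the lower key of the next request until no resume key is returned.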

When both maxRowsPerRequest and maxBytesPerRequest are set, the two
limits apply independently: a lookup is saturated as soon as either is
reached. Row-based resume keys take priority over byte-based ones since
they can point mid-stripe and are therefore more precise.
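
One way to picture how the two limits combine is the sketch below. `Budget`, `isSaturated`, and `chooseResumeKey` are hypothetical helpers written for illustration, not the functions in this diff, and treating `maxRowsPerRequest == 0` as "no limit" is an assumption carried over from the documented behavior of `maxBytesPerRequest`.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical mirror of the two per-request limits; 0 is assumed to mean
// "no limit" for both fields.
struct Budget {
  uint64_t maxRowsPerRequest{0};
  uint64_t maxBytesPerRequest{0};
};

// A lookup is saturated as soon as either configured limit is reached.
inline bool isSaturated(const Budget& budget, uint64_t rows, uint64_t bytes) {
  const bool rowLimitHit =
      budget.maxRowsPerRequest != 0 && rows >= budget.maxRowsPerRequest;
  const bool byteLimitHit =
      budget.maxBytesPerRequest != 0 && bytes >= budget.maxBytesPerRequest;
  return rowLimitHit || byteLimitHit;
}

// Prefers the row-based resume key, which may point mid-stripe, and falls
// back to the stripe-boundary key produced by byte-based truncation.
inline std::optional<std::string> chooseResumeKey(
    const std::optional<std::string>& rowResumeKey,
    const std::optional<std::string>& byteResumeKey) {
  return rowResumeKey.has_value() ? rowResumeKey : byteResumeKey;
}
```

Preferring the row-based key avoids re-reading rows: a mid-stripe key resumes exactly at the first unread row, whereas a stripe-end key can only resume at a stripe boundary.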

Changes:

  • Add maxBytesPerRequest to Options (0 = no limit).
  • Add bytesPerRequest_ member to track bytes per lookup across stripes.
  • Add stripeEndResumeKey to RequestRange, precomputed in
    lookupRowRanges() for use by byte-budget truncation in
    buildStripeResult() (where the index reader is no longer available).
  • In project(), filter saturated requests by both row and byte budgets.
  • In buildStripeResult(), track bytes and set byte-based resume keys.

Differential Revision: D99391142

Summary:
When `maxRowsPerLookup` truncates a lookup result, the caller previously had no way to know where to continue reading. This diff populates the `LookupResult::resumeKey` field to enable pagination of large result sets.

Changes:
- Add `ClusterIndexGroup::lookupChunkByRow()` to find a chunk by row position (reverse of `lookupChunk` which finds by key). Binary search on cumulative `chunk_rows`.
- Add `ClusterIndexReader::keyAtRow()` to read the encoded key at a given row position using the encoding API (reset + skip + materialize).
- In `NimbleIndexProjector::lookupRowRanges()`, when truncation occurs, call `keyAtRow()` on the first unread row to populate `resumeKey` with `{lowerKey=keyAtRow(endRow), upperKey=original.upperKey}`.
- Add `resumeKeys_` member to track resume keys per request across stripes.
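
The binary search described for `lookupChunkByRow()` might look roughly like the sketch below. It is a standalone illustration under assumptions: `findChunkByRow` is a hypothetical free function, and the input is assumed to hold inclusive running totals of `chunk_rows` (entry `i` is the number of rows in chunks `0..i`). The actual index layout in Nimble may differ.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Returns the index of the chunk containing the zero-based `row`, or
// std::nullopt if `row` is past the last chunk.
// Assumes cumulativeChunkRows[i] is the total row count of chunks 0..i.
inline std::optional<size_t> findChunkByRow(
    const std::vector<uint64_t>& cumulativeChunkRows,
    uint64_t row) {
  // The containing chunk is the first one whose running total exceeds `row`.
  const auto it = std::upper_bound(
      cumulativeChunkRows.begin(), cumulativeChunkRows.end(), row);
  if (it == cumulativeChunkRows.end()) {
    return std::nullopt;
  }
  return static_cast<size_t>(it - cumulativeChunkRows.begin());
}
```

With running totals `{10, 25, 40}`, rows 0 through 9 map to chunk 0, rows 10 through 24 to chunk 1, and rows 25 through 39 to chunk 2. Once the chunk is known, a reader in the style of `keyAtRow()` would skip to the row's offset within that chunk and materialize a single key.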

Differential Revision: D98702773
@meta-cla bot added the CLA Signed label Apr 6, 2026
@meta-codesync

meta-codesync bot commented Apr 6, 2026

@tanjialiang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D99391142.

Labels

CLA Signed, fb-exported, meta-exported
