Add experimental advise_will_need for page cache prefetching (#131)
Adds an `experimental_advise_will_need()` method that computes the
coalesced byte ranges via `ParquetFileReader::GetReadRanges` and calls
`posix_fadvise(WILLNEED)` to trigger kernel readahead into the page
cache.
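The idea above can be sketched in Python with the standard library. This
is an illustration only, not the code added by this commit:
`coalesce_ranges`, its `hole_size_limit` parameter, and this standalone
`advise_will_need` function are hypothetical stand-ins for what
`ParquetFileReader::GetReadRanges` and the new method do. It assumes a
platform where `os.posix_fadvise` is available (e.g. Linux).

```python
import os


def coalesce_ranges(ranges, hole_size_limit=8192):
    """Merge (offset, length) ranges whose gaps are at most hole_size_limit bytes.

    Hypothetical helper: stands in for the coalescing the reader performs.
    """
    merged = []
    for offset, length in sorted(ranges):
        if merged and offset - (merged[-1][0] + merged[-1][1]) <= hole_size_limit:
            # Close enough to the previous range: extend it to cover both.
            prev_offset, prev_length = merged[-1]
            end = max(prev_offset + prev_length, offset + length)
            merged[-1] = (prev_offset, end - prev_offset)
        else:
            merged.append((offset, length))
    return merged


def advise_will_need(path, ranges):
    """Hint the kernel to read the coalesced ranges into the page cache.

    Returns the coalesced ranges that were advised. The call is only a
    hint: the kernel may start readahead but is free to ignore it.
    """
    coalesced = coalesce_ranges(ranges)
    fd = os.open(path, os.O_RDONLY)
    try:
        for offset, length in coalesced:
            os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)
    return coalesced
```

Coalescing first keeps the number of `posix_fadvise` calls small when
many column chunks sit next to each other in the file.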
Arrow's `pre_buffer=True` dispatches reads to a shared IO thread pool
(`ReadAsync`), allocating large buffers that are not CPU-cache friendly.
This causes last-level cache (LLC) misses when a different thread later decodes from that
buffer. The new method lets users warm the page cache before calling
`read_into_numpy` with `pre_buffer=False`. Each worker thread then
performs its own `pread` and decoding, keeping allocations small and
CPU-cache-friendly.
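A rough sketch of the per-worker pattern described above, assuming the
page cache has already been warmed. It does not use this library's
`read_into_numpy`; `read_and_decode` and `process_ranges` are
hypothetical names, and summing the bytes is only a stand-in for real
decoding.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def read_and_decode(fd, offset, length):
    """Read one range with pread and decode it on the same thread."""
    # pread takes an explicit offset, so threads can share one descriptor
    # without racing on a file position. It may return fewer bytes than
    # requested near end of file; this sketch does not retry.
    data = os.pread(fd, length, offset)
    return sum(data)  # placeholder for decoding the small, thread-local buffer


def process_ranges(path, ranges, max_workers=4):
    """Have each worker thread read and decode its own (offset, length) range."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(read_and_decode, fd, off, ln) for off, ln in ranges]
            return [f.result() for f in futures]
    finally:
        os.close(fd)
```

Because the thread that issues the `pread` is the same thread that
consumes the buffer, the data it decodes was just written by that
thread rather than by a separate IO thread.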
The method carries an `experimental_` prefix because it is an
experimental feature and is not guaranteed to become part of the stable
API.