Add LinearBinarySearch{MAX}: bounded linear walk + binary fallback #72

Draft
ChrisRackauckas-Claude wants to merge 1 commit into SciML:main from ChrisRackauckas-Claude:linearbinary-search

Summary

Adds LinearBinarySearch{MAX} <: SearchStrategy to FFF: walk linearly from the hint for up to MAX steps, then fall back to a binary search if the answer isn't bracketed within the window. Mirrors the strategy of the same name in FastInterpolations.jl. FFF previously had ExpFromLeft (forward exponential doubling) and BracketGallop (bidirectional doubling) but no bounded-linear-with-binary-fallback option.

The intended workload is small-gap ODE-style monotone-forward sweeps where the exponential-doubling overhead of ExpFromLeft/BracketGallop is pure cost and a tight unrolled linear walk is fastest.
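
As a rough illustration of the strategy described above, the following Julia sketch walks from a hint and then bisects. It is not the PR's code: the function name `hinted_searchsortedlast`, its runtime `max_steps` argument (the PR uses a type parameter instead), and the use of `searchsortedlast` on a view for the fallback are assumptions made for the example. It handles only the default forward ordering on a 1-based `Vector`.

```julia
# Hypothetical sketch of "bounded linear walk, then binary fallback".
# Returns the same index as Base.searchsortedlast(v, x) for a sorted Vector v.
function hinted_searchsortedlast(v::Vector, x, hint::Integer, max_steps::Integer)
    n = length(v)
    # No usable hint: go straight to a plain binary search.
    (1 <= hint <= n) || return searchsortedlast(v, x)
    i = Int(hint)
    if !isless(x, v[i])
        # v[i] <= x, so the answer is at or after the hint: walk forward.
        # One gap-0 check plus max_steps advance-and-check pairs.
        for _ in 0:max_steps
            (i == n || isless(x, v[i + 1])) && return i
            i += 1
        end
        # Not bracketed within the window; v[i] <= x still holds, bisect i:n.
        return searchsortedlast(view(v, i:n), x) + (i - 1)
    else
        # x sorts before v[i], so the answer is before the hint: walk backward.
        for _ in 1:max_steps
            i -= 1
            i == 0 && return 0            # x precedes every element
            isless(x, v[i]) || return i   # first v[i] <= x going backward
        end
        # Not bracketed within the window; bisect the prefix 1:(i - 1).
        return searchsortedlast(view(v, 1:(i - 1)), x)
    end
end

# The answer is 2 steps past the hint, so the walk finds it without bisecting.
hinted_searchsortedlast([1.0, 2.0, 4.0, 8.0], 5.0, 1, 8)  # → 3
```

With a small window and a distant answer, the same call takes the fallback branch and still agrees with `searchsortedlast(v, x)`; only the cost differs.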

Design choices

  • Default MAX = 8, matching FastInterpolations.jl. Allowed values are {0, 1, 2, 4, 8, 16, 32, 64, 128} via a factory constructor — curated to keep the per-MAX method specialization table bounded. Arbitrary integers via the parametric form LinearBinarySearch{k}() still work but won't go through validation.
  • MAX is a type parameter, so the walk can be specialized at compile time. For MAX ≤ 16 an @generated function fully unrolls it into flat, branchless-ish code; for MAX > 16 the walk falls back to a bounded while-loop (unrolling at 128 would balloon the code size).
  • Order-aware — uses Base.Order.lt on the predicate, so Forward and Reverse orderings share one code path.
  • No-hint and out-of-range hint fall through to BinaryBracket.
  • MAX covers gaps 0..MAX (initial gap-0 check plus MAX advance-and-check pairs).
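
The design choices above could look roughly like the sketch below: a validating factory constructor over a curated set, and a forward walk that is unrolled for small MAX and looped for large MAX. Everything here is illustrative and assumed rather than taken from the PR: `LinearBinarySketch`, `ALLOWED_MAX`, and `forward_walk` are hypothetical names, the walk covers only the forward direction with default ordering, and it returns 0 as a "not bracketed" sentinel where the real strategy would hand over to the binary fallback.

```julia
# Hypothetical stand-in type; the PR's real type is LinearBinarySearch{MAX}.
struct LinearBinarySketch{MAX} end

const ALLOWED_MAX = (0, 1, 2, 4, 8, 16, 32, 64, 128)

# Factory constructor: validates MAX, then lifts it into the type domain.
# The parametric form LinearBinarySketch{k}() bypasses this check.
function LinearBinarySketch(max_steps::Integer = 8)
    max_steps in ALLOWED_MAX ||
        throw(ArgumentError("MAX must be one of $(ALLOWED_MAX), got $(max_steps)"))
    return LinearBinarySketch{Int(max_steps)}()
end

# Forward walk. Precondition: 1 <= hint <= length(v) and v[hint] <= x.
# Returns the bracketing index, or 0 if gaps 0..MAX did not bracket x.
@generated function forward_walk(::LinearBinarySketch{MAX}, v::Vector, x, hint::Int) where {MAX}
    if MAX > 16
        # Large windows: a bounded loop keeps the generated code small.
        return quote
            n = length(v)
            i = hint
            for _ in 0:$MAX
                (i == n || isless(x, v[i + 1])) && return i
                i += 1
            end
            return 0
        end
    end
    # Small windows: emit MAX + 1 straight-line checks with constant offsets.
    body = Expr(:block, :(n = length(v)))
    for k in 0:MAX
        push!(body.args,
            :((hint + $k == n || isless(x, v[hint + $k + 1])) && return hint + $k))
    end
    push!(body.args, :(return 0))
    return body
end

v = collect(1.0:100.0)
forward_walk(LinearBinarySketch(4), v, 3.5, 1)   # → 3 (gap 2, inside the window)
forward_walk(LinearBinarySketch(4), v, 50.5, 1)  # → 0 (gap 49, caller would bisect)
```

Because MAX is part of the type, each allowed value compiles its own method body, which is why a curated set keeps the number of specializations bounded.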

Bench (bench/linearbinary_sweep.jl, n = 100k Float64, ns/query)

gap   LBS{4}   LinearScan   ExpFromLeft   BracketGallop
1      9.6      10.0         11.7          17.0
2     11.0      11.0         12.8          20.5
4     14.0      14.0         22.0          23.5
8     43.0      19.0         22.7          25.0   ← MAX exceeded → fallback
16    42.0      30.0         26.0          29.5
64    42.0     101.0         33.0          39.5

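LinearBinarySearch{4} wins by ~0.5–1 ns/q at gap = 1 (9.6 vs 10.0 for LinearScan in the run above) and ties LinearScan at gaps 2 and 4. At gaps beyond MAX, the binary fallback caps the worst case at O(log n): the cost flattens at ~42–43 ns/q regardless of gap. That is slower than ExpFromLeft and BracketGallop at every measured gap past MAX (for example 42.0 vs 39.5 for BracketGallop at gap = 64), but it stays bounded, whereas LinearScan grows to 101 ns/q at gap = 64.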

Auto integration decision: opt-in only

The gap = 1 win is too marginal (~1 ns/q) for a runtime Auto heuristic to recoup: the per-call branch that would choose LinearBinarySearch over LinearScan would itself cost about as much as it saves. This matches FFF's existing pattern for BitInterpolationSearch, which is kept opt-in for callers who have measured their workloads. Auto's per-query decision tree is unchanged.

Test plan

  • Construction & dispatch on all allowed MAX values, including ArgumentError on bad values
  • Parity vs Base.searchsortedlast/first fuzz on Float64 + Int64 across MAX ∈ {0, 1, 2, 4, 8, 16, 32, 64, 128}
  • Reverse-order parity fuzz
  • Hint past / below / at the answer (walks correct direction)
  • Gap = MAX vs gap = MAX + 1 (binary fallback boundary)
  • Edge cases: empty vector, n=1, n=1M, hint at firstindex/lastindex, hint == answer, out-of-range hint, no-hint dispatch
  • Duplicates (verifies searchsortedfirst finds the first occurrence after a backward walk)
  • All ~36k new tests pass; full suite passes (136766 passes total)
  • Runic-clean
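
The parity fuzz items in the plan above might be structured like the following sketch, which compares a hinted search against `searchsortedlast` over random sorted vectors in both orderings. It is an assumption-laden illustration, not the PR's test suite: `walk_then_bisect` and `parity_mismatches` are hypothetical names, the candidate takes the window size as a runtime argument, and only `searchsortedlast` on `Int` data is exercised.

```julia
using Random
using Base.Order: Ordering, Forward, Reverse, lt

# Hypothetical order-aware candidate: bounded walk from the hint, then bisect.
function walk_then_bisect(v::Vector, x, hint::Int, max_steps::Int, o::Ordering)
    n = length(v)
    (1 <= hint <= n) || return searchsortedlast(v, x; order = o)
    i = hint
    if !lt(o, x, v[i])
        # Answer is at or after the hint: gap-0 check plus max_steps advances.
        for _ in 0:max_steps
            (i == n || lt(o, x, v[i + 1])) && return i
            i += 1
        end
        return searchsortedlast(view(v, i:n), x; order = o) + (i - 1)
    else
        # Answer is before the hint: step backward up to max_steps times.
        for _ in 1:max_steps
            i -= 1
            i == 0 && return 0
            lt(o, x, v[i]) || return i
        end
        return searchsortedlast(view(v, 1:(i - 1)), x; order = o)
    end
end

# Counts disagreements with Base over random inputs, including empty vectors,
# duplicates, and hints one position outside the valid range on either side.
function parity_mismatches(rng; trials = 2_000)
    bad = 0
    for _ in 1:trials
        n = rand(rng, 0:40)
        max_steps = rand(rng, (0, 1, 2, 4, 8))
        for (o, rev) in ((Forward, false), (Reverse, true))
            v = sort!(rand(rng, 1:20, n); rev = rev)
            x = rand(rng, 0:21)
            hint = rand(rng, 0:(n + 1))
            got = walk_then_bisect(v, x, hint, max_steps, o)
            bad += got != searchsortedlast(v, x; order = o)
        end
    end
    return bad
end

@assert parity_mismatches(MersenneTwister(42)) == 0
```

Drawing values from a narrow range (1:20) makes duplicates frequent, which is what exercises the backward walk landing on the correct occurrence.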

Please ignore until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

LinearBinarySearch{MAX} walks linearly from the hint for up to MAX steps,
then falls back to BinaryBracket if the answer isn't bracketed within the
window. The win regime is small-gap ODE-style monotone-forward workloads
where the exponential-doubling overhead of ExpFromLeft / BracketGallop is
pure cost and a tight unrolled linear walk is fastest.

Default MAX = 8; allowed values are {0, 1, 2, 4, 8, 16, 32, 64, 128} via a
factory constructor. The set is curated to keep the per-MAX method
specialization table bounded; arbitrary integers would explode it. MAX is
a type parameter so the walk is fully unrolled (for MAX ≤ 16) via
@generated, producing flat branchless-ish code that LLVM can fold.

Bench (bench/linearbinary_sweep.jl, n = 100k Float64, ns/query):

  gap   LBS{4}   LinearScan   ExpFromLeft   BracketGallop
  1      9.6      10.0         11.7          17.0
  2     11.0      11.0         12.8          20.5
  4     14.0      14.0         22.0          23.5
  8     43.0      19.0         22.7          25.0   ← MAX exceeded → fallback
  16    42.0      30.0         26.0          29.5
  64    42.0     101.0         33.0          39.5

LBS{4} wins by ~0.5–1 ns/q at gap = 1; ties LinearScan at gap = 2. At
gaps beyond MAX, the binary fallback caps the worst case at O(log n).
The strategy is opt-in — Auto does not pick it, because the gap=1 win is
too marginal for a runtime heuristic to recoup.

Includes tests covering: factory constructor validation, parity-vs-Base
across MAX ∈ {0, 1, 2, 4, 8, 16, 32, 64, 128}, Forward and Reverse
orderings, hint past / below / at the answer, gap = MAX vs gap = MAX+1
boundary, empty / n=1 / n=1M, duplicates, and no-hint fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>