Add contrib intdiv: fast integer division by invariant scalars using multiplication#2875
abhishek-iitmadras wants to merge 1 commit into google:master
jan-wassenberg
left a comment
A bunch of comments :) Some influence others, so please read them all before addressing any.
 *
 * We split the work into two steps:
 *   1) Precompute parameters from the scalar divisor (multiplier + shifts).
 *        DivisorParams{U,S}<T> ComputeDivisorParams(T divisor);
Can we reuse any of the existing logic from base.h Divisor[64]?
I checked base.h Divisor / Divisor64, which solve a different, scalar-only problem using a simpler reciprocal-multiply scheme. I don't think we can reuse that logic here. contrib/intdiv implements the Granlund-Montgomery (GM) invariant-division algorithm, including the correction step, signed handling, and edge cases. It also supports vector lane division, with separate logic for signed vs. unsigned types, a widened-multiply path for 8/16-bit lanes, vector MulHigh, and a target-specific scalar fallback for 64-bit lanes. Direct reuse of the existing logic is therefore unlikely to fit.
Correct me if I am wrong.
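For readers unfamiliar with the scheme under discussion, here is a minimal scalar sketch of the unsigned 32-bit case from the Granlund-Montgomery paper (precompute a multiplier and two shifts, then divide with a high multiply plus a correction step). The names `ParamsU32`, `ComputeParamsU32`, `MulHighU32` and `DivideU32` are hypothetical and are not taken from the PR, whose `DivisorParamsU<T>` may be laid out differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical parameter struct for illustration only.
struct ParamsU32 {
  uint32_t multiplier;  // m' in the GM paper
  uint32_t shift1;      // min(l, 1)
  uint32_t shift2;      // max(l - 1, 0)
};

// Step 1: precompute parameters from the invariant divisor (must be nonzero).
inline ParamsU32 ComputeParamsU32(uint32_t divisor) {
  assert(divisor != 0);
  uint32_t l = 0;  // l = ceil(log2(divisor))
  while ((uint64_t{1} << l) < divisor) ++l;
  const uint64_t two_l = uint64_t{1} << l;
  // m' = floor(2^32 * (2^l - d) / d) + 1; this always fits in 32 bits.
  const uint64_t m = ((uint64_t{1} << 32) * (two_l - divisor)) / divisor + 1;
  ParamsU32 p;
  p.multiplier = static_cast<uint32_t>(m);
  p.shift1 = l < 1 ? l : 1;
  p.shift2 = l > 1 ? l - 1 : 0;
  return p;
}

// Upper 32 bits of the 64-bit product.
inline uint32_t MulHighU32(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((uint64_t{a} * b) >> 32);
}

// Step 2: divide using one high multiply, an add/sub correction and shifts.
inline uint32_t DivideU32(uint32_t n, const ParamsU32& p) {
  const uint32_t t1 = MulHighU32(p.multiplier, n);
  // t1 <= n, so (n - t1) cannot wrap, and the sum cannot exceed n.
  return (t1 + ((n - t1) >> p.shift1)) >> p.shift2;
}
```

The parameters are computed once per divisor and then reused for every numerator, which is what makes the approach attractive inside loops.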
hwy/contrib/intdiv/intdiv-inl.h
Outdated
};

template <>
struct MulType<uint8_t> {
We can use existing functionality: base.h also has
template <>
struct Relations<uint8_t> {
using Unsigned = uint8_t;
using Signed = int8_t;
using Wide = uint16_t;
};
etc, so we could use MulType = Relations::Wide.
I tried this, but it is only partially applicable and not a direct replacement, because the mapping diverges for 32-bit and 64-bit types.
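To illustrate the widened-multiply path mentioned for narrow lanes, the following scalar sketch divides `uint8_t` values by forming the full product in `uint16_t` and taking its upper half. It applies the same Granlund-Montgomery formulas with N = 8. All names (`ParamsU8`, `ComputeParamsU8`, `DivideU8`, `CountU8Mismatches`) are hypothetical, and this is an assumption about how such a path could look, not the PR's code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit parameters for illustration only.
struct ParamsU8 {
  uint8_t multiplier;
  uint8_t shift1;
  uint8_t shift2;
};

inline ParamsU8 ComputeParamsU8(uint8_t divisor) {
  assert(divisor != 0);
  const unsigned d = divisor;
  unsigned l = 0;  // l = ceil(log2(d))
  while ((1u << l) < d) ++l;
  // m' = floor(2^8 * (2^l - d) / d) + 1; this fits in 8 bits.
  const unsigned m = (256u * ((1u << l) - d)) / d + 1u;
  ParamsU8 p;
  p.multiplier = static_cast<uint8_t>(m);
  p.shift1 = static_cast<uint8_t>(l < 1 ? l : 1);
  p.shift2 = static_cast<uint8_t>(l > 1 ? l - 1 : 0);
  return p;
}

inline uint8_t DivideU8(uint8_t n, const ParamsU8& p) {
  // Widened multiply: the 16-bit product holds all bits of 8 x 8.
  const uint16_t wide =
      static_cast<uint16_t>(uint16_t{p.multiplier} * uint16_t{n});
  const uint8_t t1 = static_cast<uint8_t>(wide >> 8);  // "MulHigh"
  const uint8_t diff = static_cast<uint8_t>(n - t1);   // t1 <= n
  return static_cast<uint8_t>((t1 + (diff >> p.shift1)) >> p.shift2);
}

// Exhaustively compares against the built-in operator for all inputs.
inline int CountU8Mismatches() {
  int mismatches = 0;
  for (unsigned d = 1; d < 256; ++d) {
    const ParamsU8 p = ComputeParamsU8(static_cast<uint8_t>(d));
    for (unsigned n = 0; n < 256; ++n) {
      if (DivideU8(static_cast<uint8_t>(n), p) != n / d) ++mismatches;
    }
  }
  return mismatches;
}
```

For 32-bit and 64-bit lanes a wider product type is either expensive or unavailable, which is one plausible reason the multiplier type would stay at the lane width there and rely on a high-multiply operation instead.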
hwy/contrib/intdiv/intdiv.h
Outdated
  return HWY_NAMESPACE::ComputeDivisorParams<T>(d);
}

template <typename T, HWY_IF_T_SIZE(T, 1), HWY_IF_SIGNED(T)>
Rather than SFINAE for T size 1..8, we can use HWY_IF_T_SIZE_ONE_OF(T, (1 << 1) | (1 << 2) | (1 << 4) | (1 << 8)), or better yet, just static_assert(IsPow2(sizeof(T))) within one function.
OK, got it. I have now collapsed the repeated size-based SFINAE overloads into a single signed overload and a single unsigned overload, each using static_assert to check for supported integer sizes.
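A minimal sketch of that pattern, using only the standard library: one function template replaces four size-specific overloads, and unsupported types are rejected with `static_assert`. The helper names `IsPow2Size` and `CeilLog2` are hypothetical, and the function body (computing the shift count used during precomputation) is just a small stand-in for the real work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: true if the byte count is a power of two.
constexpr bool IsPow2Size(size_t bytes) {
  return bytes != 0 && (bytes & (bytes - 1)) == 0;
}

// One template for all unsigned lane types instead of per-size overloads.
// Returns ceil(log2(divisor)); the divisor must be nonzero.
template <typename T>
unsigned CeilLog2(T divisor) {
  static_assert(std::is_integral<T>::value && std::is_unsigned<T>::value,
                "T must be an unsigned integer type");
  static_assert(IsPow2Size(sizeof(T)) && sizeof(T) <= 8,
                "T must be 1, 2, 4 or 8 bytes");
  assert(divisor != 0);
  const uint64_t d = divisor;
  unsigned l = 0;
  // The bound on l avoids shifting by 64 when T is uint64_t.
  while (l < 8 * sizeof(T) && (uint64_t{1} << l) < d) ++l;
  return l;
}
```

Compared with SFINAE, a misuse such as `CeilLog2<float>(1.0f)` produces the assertion message directly rather than a "no matching overload" error. A signed counterpart would follow the same shape with `std::is_signed`.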
hwy/contrib/intdiv/intdiv.h
Outdated
  return HWY_NAMESPACE::ComputeDivisorParams<T>(d);
}

template <class D, class V = VecD<D>, typename T = TFromD_<D>, HWY_IF_UNSIGNED_D(D)>
Here also we could static_assert(IsUnsigned()) inside the function, given that you have a DivisorParamsU argument.
hwy/contrib/intdiv/intdiv_test.cc
Outdated
  if constexpr (sizeof(T) <= 4) {
    return static_cast<T>(Random32(&rng));
  } else {
    const uint64_t hi = Random32(&rng);
We do have a Random64().
Done, I now use Random64().
hwy/contrib/intdiv/intdiv_test.cc
Outdated
}

template <typename T>
bool IsPow2(T x) {
Already defined in intdiv-inl.h?
Signed-off-by: Abhishek Kumar <abhishek.r.kumar@fujitsu.com>
9b1a2f6 to d728618
template <>
struct MultiplierType<uint32_t> {
  using type = uint32_t;
We can use using type = If<(sizeof(T) < 4), Relations::Wide, T>.
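The suggestion can be sketched in standard C++ as follows. Here `WideOf` is a hypothetical local stand-in for the `Wide` member of the `Relations` trait, and `std::conditional_t` plays the role of `If`. The 64-bit entry is a placeholder assumed only so that the trait is defined for every lane type.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for a "next wider unsigned type" trait.
template <typename T>
struct WideOf;
template <>
struct WideOf<uint8_t> { using type = uint16_t; };
template <>
struct WideOf<uint16_t> { using type = uint32_t; };
template <>
struct WideOf<uint32_t> { using type = uint64_t; };
template <>
struct WideOf<uint64_t> { using type = uint64_t; };  // placeholder only

// 8/16-bit lanes multiply in the wider type; 32/64-bit lanes keep T.
template <typename T>
using MultiplierTypeOf =
    std::conditional_t<(sizeof(T) < 4), typename WideOf<T>::type, T>;

static_assert(std::is_same<MultiplierTypeOf<uint8_t>, uint16_t>::value,
              "8-bit lanes widen to 16 bits");
static_assert(std::is_same<MultiplierTypeOf<uint32_t>, uint32_t>::value,
              "32-bit lanes keep their width");
```

A single alias like this removes the need for one explicit specialization per lane type.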
This change adds a contrib module implementing fast integer division by invariant (loop-constant) divisors using multiplication and shifts, following Granlund & Montgomery, “Division by Invariant Integers Using Multiplication” (PLDI 1994).
Supports all scalar lane widths and signs:
This contrib module provides a general-purpose, cross-architecture implementation of division by invariant scalars using multiplication, suitable for vectorized code built on Highway. It mirrors the Granlund-Montgomery (GM) scheme and is conceptually similar to the integer SIMD division intrinsics used in NumPy’s npyv_intdiv, but is expressed purely in Highway’s portable SIMD API.
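As a rough illustration of the signed case, the sketch below follows the signed algorithm from the Granlund-Montgomery paper for 32-bit operands, rounding toward zero like the built-in operator. The names `ParamsS32`, `ComputeParamsS32` and `DivideS32` are hypothetical. Intermediates are held in `int64_t` to sidestep signed overflow, and right shifts of negative values are assumed to be arithmetic (guaranteed from C++20, and the behavior of mainstream compilers before that). The module's vector code presumably works at lane width instead.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signed parameters for illustration only.
struct ParamsS32 {
  int32_t multiplier;    // m' = m - 2^32
  int32_t shift;         // l - 1
  int32_t divisor_sign;  // -1 if the divisor is negative, else 0
};

inline ParamsS32 ComputeParamsS32(int32_t divisor) {
  assert(divisor != 0);
  const int64_t wide = divisor;
  const uint64_t abs_d = static_cast<uint64_t>(wide < 0 ? -wide : wide);
  uint32_t l = 1;  // l = max(ceil(log2(|d|)), 1)
  while ((uint64_t{1} << l) < abs_d) ++l;
  const uint64_t m = 1 + (uint64_t{1} << (32 + l - 1)) / abs_d;
  const int64_t m_prime = static_cast<int64_t>(m) - (int64_t{1} << 32);
  ParamsS32 p;
  p.multiplier = static_cast<int32_t>(m_prime);
  p.shift = static_cast<int32_t>(l - 1);
  p.divisor_sign = static_cast<int32_t>(divisor < 0 ? -1 : 0);
  return p;
}

// The quotient must be representable: n == INT32_MIN with a divisor of -1
// is excluded, as it is for the built-in operator.
inline int32_t DivideS32(int32_t n, const ParamsS32& p) {
  // Signed high multiply: upper 32 bits of the 64-bit product.
  const int64_t mulsh = (int64_t{p.multiplier} * int64_t{n}) >> 32;
  const int64_t n_sign = n < 0 ? -1 : 0;
  int64_t q0 = int64_t{n} + mulsh;
  q0 = (q0 >> p.shift) - n_sign;  // round toward zero
  // Conditionally negate: (x ^ -1) - (-1) == -x, (x ^ 0) - 0 == x.
  const int64_t q = (q0 ^ p.divisor_sign) - p.divisor_sign;
  return static_cast<int32_t>(q);
}
```

For example, `DivideS32(-100, ComputeParamsS32(7))` yields -14, matching `-100 / 7` in C++.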