Commit 49bc911
SIMD optimization RaBitQ (#4515)
Summary:
This diff introduces a new file rabitq_simd.h with multiple SIMD-optimized implementations of the dot product calculation using population count (popcnt) operations:
1. AVX-512 implementation with AVX512VPOPCNTDQ: Processes data in 512-bit (64-byte) chunks using dedicated AVX-512 popcnt instructions, with fallbacks to smaller vector sizes for remaining data.
2. AVX-512 fallback implementation without AVX512VPOPCNTDQ: Uses AVX512F instructions with a lookup-based popcount method for 512-bit vectors, falling back to smaller vectors for remaining data.
3. AVX2 implementation: Uses a lookup-based popcount method with 256-bit (32-byte) AVX2 instructions, handling leftovers with 128-bit SSE operations and scalar processing.
4. Scalar fallback: Processes data in 64-bit chunks with builtin popcount operations for systems without SIMD support.
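To make the popcount-based approach concrete, here is a minimal sketch of variants 3 and 4 above. It is not the code from rabitq_simd.h: the function names `and_popcount_scalar` and `and_popcount_avx2` are hypothetical, the quantity computed (the number of set bits in the bitwise AND of two byte arrays) is an assumed simplification of the dot product, and the AVX2 version hands leftovers straight to the scalar routine instead of using an intermediate 128-bit SSE step. It also assumes a GCC- or Clang-compatible compiler for the `__builtin_popcount` family.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical helper (not the faiss API): counts the set bits of (a AND b)
// over `size` bytes, 64 bits at a time, then byte by byte for the tail.
inline uint64_t and_popcount_scalar(
        const uint8_t* a, const uint8_t* b, size_t size) {
    uint64_t sum = 0;
    size_t i = 0;
    for (; i + 8 <= size; i += 8) {
        uint64_t x, y;
        std::memcpy(&x, a + i, 8); // memcpy avoids unaligned-access issues
        std::memcpy(&y, b + i, 8);
        sum += static_cast<uint64_t>(__builtin_popcountll(x & y));
    }
    for (; i < size; ++i) {
        sum += static_cast<uint64_t>(__builtin_popcount(a[i] & b[i]));
    }
    return sum;
}

#if defined(__AVX2__)
// Hypothetical AVX2 variant: lookup-based popcount on 32-byte chunks.
inline uint64_t and_popcount_avx2(
        const uint8_t* a, const uint8_t* b, size_t size) {
    // Per-nibble bit counts, repeated for both 128-bit lanes.
    const __m256i lut = _mm256_setr_epi8(
            0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
            0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4);
    const __m256i low_mask = _mm256_set1_epi8(0x0f);
    const __m256i zero = _mm256_setzero_si256();
    __m256i acc = zero;
    size_t i = 0;
    for (; i + 32 <= size; i += 32) {
        __m256i va = _mm256_loadu_si256(
                reinterpret_cast<const __m256i*>(a + i));
        __m256i vb = _mm256_loadu_si256(
                reinterpret_cast<const __m256i*>(b + i));
        __m256i v = _mm256_and_si256(va, vb);
        // Split each byte into its low and high nibble.
        __m256i lo = _mm256_and_si256(v, low_mask);
        __m256i hi = _mm256_and_si256(_mm256_srli_epi16(v, 4), low_mask);
        // Table lookup gives the bit count of each nibble (at most 4 each).
        __m256i cnt = _mm256_add_epi8(
                _mm256_shuffle_epi8(lut, lo), _mm256_shuffle_epi8(lut, hi));
        // SAD against zero sums each group of 8 byte counts into a 64-bit lane.
        acc = _mm256_add_epi64(acc, _mm256_sad_epu8(cnt, zero));
    }
    uint64_t lanes[4];
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(lanes), acc);
    uint64_t sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
    // Remaining bytes (fewer than 32) go through the scalar path.
    return sum + and_popcount_scalar(a + i, b + i, size - i);
}
#endif
```

The AVX2 function is only compiled when the compiler is targeting AVX2 (for example with `-mavx2`), so the same translation unit still builds on targets without it. An AVX-512 variant with AVX512VPOPCNTDQ would follow the same loop shape but could replace the nibble lookup with a dedicated popcount instruction on 64-byte chunks.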
Differential Revision: D793016071
parent 8482842 commit 49bc911
2 files changed
Lines changed: 543 additions & 24 deletions