Skip to content

fix: cagra bug#4963

Open
Anarion-zuo wants to merge 5 commits intofacebookresearch:mainfrom
Anarion-zuo:aaronzuo/fix_cagra
Open

fix: cagra bug#4963
Anarion-zuo wants to merge 5 commits intofacebookresearch:mainfrom
Anarion-zuo:aaronzuo/fix_cagra

Conversation

@Anarion-zuo
Copy link
Copy Markdown

Summary
Resolved #4927.

  • Fixes base-level-only IndexHNSWCagra::range_search to return results without an entry point.
  • Adds a C++ unit test reproducing the base-level-only range search case and a GPU interop Python test.

Changes

  • Implement base-level-only range search path in IndexHNSWCagra using per-query range handlers.
  • Add C++ helper to copy level-0 graph only and a new test for base-level-only range search.
  • Add GPU interop Python test for base-level-only range search.

Tests

  • docker build -f docker/Dockerfile.dev -t faiss-dev:local --build-arg UID="$(id -u)" --build-arg GID="$(id -g)" .
  • docker run --rm -v "$PWD:/workspace" -w /workspace faiss-dev:local bash -lc "cmake -B build-docker -G Ninja -DFAISS_ENABLE_GPU=OFF -DFAISS_ENABLE_PYTHON=ON -DBUILD_TESTING=ON -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo . && cmake --build build-docker -j --target faiss faiss_test swigfaiss && ./build-docker/tests/faiss_test --gtest_filter=HNSW.Test_IndexHNSWCagra_BaseLevelOnly_RangeSearch"

Notes

  • The Python GPU test is skipped without CUVS/GPU support; it is included for GPU-enabled CI.

Signed-off-by: aaronzuo <anarionzuo@outlook.com>
@navneet1v
Copy link
Copy Markdown
Contributor

@Anarion-zuo thanks for raising the PR. This is way better solution than mine. I had to make changes at multiple places. But this far more cleaner

ndis += stats.ndis;
nhops += stats.nhops;
res.end();
vt.advance();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this is needed? just for understanding?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VisitedTable vt is reused across queries inside the same parallel region for performance (to avoid reallocating/clearing a large visited array for every query). vt.advance() is what “resets” the visited state between queries in O(1) by bumping an epoch counter (or clearing the hashset variant), so each query starts with an empty visited set.

Without vt.advance(), nodes visited while processing query i would remain marked as visited for query i+1, which can cause the search to skip almost everything and return incorrect / empty results.

This is the same pattern used elsewhere in HNSW search code, e.g. in the main search loop that shares a VisitedTable across queries and calls advance() after each query.

@Anarion-zuo
Copy link
Copy Markdown
Author

@navneet1v Hi. It's been some time. Do I need a maintainer's approval to trigger the CI tests?

@Anarion-zuo
Copy link
Copy Markdown
Author

@navneet1v Hi. Are you a maintainer? Someone should trigger the CI on this one.

@navneet1v
Copy link
Copy Markdown
Contributor

@navneet1v Hi. Are you a maintainer? Someone should trigger the CI on this one.

I am not the maintainer

@navneet1v
Copy link
Copy Markdown
Contributor

taggin @mdouze

Signed-off-by: aaronzuo <anarionzuo@outlook.com>
@mnorris11
Copy link
Copy Markdown
Contributor

cc @cjnolet @tarang-jain

@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync bot commented Mar 23, 2026

@mnorris11 has imported this pull request. If you are a Meta employee, you can view this in D97766638.

@mnorris11
Copy link
Copy Markdown
Contributor

@Anarion-zuo would you be able to fix the unit tests? Are the logs visible to you?

Signed-off-by: aaronzuo <anarionzuo@outlook.com>
@Anarion-zuo
Copy link
Copy Markdown
Author

@mnorris11 I fixed the visible part of the CI.

Signed-off-by: aaronzuo <anarionzuo@outlook.com>
@Anarion-zuo
Copy link
Copy Markdown
Author

@mnorris11 gentle ping

}

template <>
struct simd512bit_tpl<SIMDLevel::AVX512_SPR>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did the template have to change for AVX512 if the issue is related to CAGRA / HNSW? #5015 added SIMD dispatch for HNSW.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] IndexHNSWCagra with Base layer only return 0 results for range_search

4 participants