Skip to content

[Perf] Replace stats_mutex_ with lock-free atomic counter in HNSW search path #1720

@LHT129

Description

@LHT129

Description

Replace the exclusive stats_mutex_ lock with a lock-free solution using std::atomic in WindowResultQueue to eliminate lock contention in high-QPS concurrent search scenarios.

Background

The HNSW and DiskANN indices record statistics (search time, IO count, etc.) during each search operation for the GetStats() API. The current implementation uses an exclusive lock (std::mutex), which becomes a performance bottleneck in high-QPS concurrent search scenarios.

Problem Analysis

  1. Exclusive lock on every search: stats_mutex_ is a global exclusive lock, forcing all concurrent search threads to serialize access. In high-QPS scenarios (10,000+ QPS), this lock becomes a significant contention hotspot.
  2. Lightweight Push operation: WindowResultQueue::Push is very lightweight (a single array write + index increment), making the lock overhead disproportionate to the operation itself.
  3. Worse in DiskANN: Multiple Push operations are performed within a single lock-protected region.

Solution

Convert WindowResultQueue::count_ to std::atomic<uint64_t> and use fetch_add for atomic write position advancement.

Implementation Details

  • Use std::atomic<uint64_t> as the write index
  • Use fetch_add to atomically advance the write position
  • Use relaxed memory order for reads in GetStats()
  • Accept minor data races in statistics (acceptable since statistics are approximate values)

Changes

  • src/utils/window_result_queue.h: Added #include <atomic>, changed count_ to std::atomic<uint64_t>
  • src/utils/window_result_queue.cpp: Use fetch_add in Push(), use load(relaxed) in GetAvgResult()
  • src/index/hnsw.cpp: Removed stats_mutex_ protection in GetStats() and search methods
  • src/index/diskann.cpp: Removed stats_mutex_ protection in GetStats() and search methods

Expected Impact

  • Eliminate stats_mutex_ contention in high-concurrency search scenarios
  • Improved search throughput in multi-threaded benchmarks (actual improvement depends on thread count and QPS)
  • No impact on statistical data accuracy (minor data race-induced deviations acceptable)

Related

  • Original task: agent-hive/tasks/2026-03-13-替换-hnsw-stats-mutex-为无锁计数器.md
  • Reference implementation: src/utils/spsc_queue.h (existing lock-free SPSC queue)

Metadata

Metadata

Assignees

Labels

created-by-AIThe issue is found and created by AI Agent

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions