Add multi-DB stress testing support (--num_dbs flag) (#14749)
hx235 wants to merge 1 commit into
@hx235 has exported this pull request. If you are a Meta employee, you can view the originating Diff in D104959942.
❌ clang-tidy: 1 error(s) and 7 warning(s) on changed lines. Completed in 151.3s.
Summary: Add --num_dbs flag to run N independent DB instances in parallel. Each StressTest instance has its own DB with isolated fault injection (from Diff 2). db_crashtest.py defaults to num_dbs=1.

Path handling design: crash_main functions (blackbox/whitebox) compute specific per-DB paths from a base directory using get_subdirectory_paths() -- pure path computation, no I/O. gen_cmd validates path count matches num_dbs, creates EV directories, and handles destroy_db_initially for expected state. Both db and expected_values_dir follow the same validate logic in gen_cmd. If user passes specific paths (comma-separated), they are validated but not expanded.

Shared resources (block_cache, write_buffer_manager, rate_limiter) are created once in tool.cc and shared across all DB instances. Most flags (threads, max_key, ops_per_thread) are per-DB.

Other changes:
- --num_dbs flag (default 1) with guards for not-yet-supported features
- [db_label] prefix in all stdout output including stats (e.g. [db_0] Stress Test: ...)
- Per-DB crash callback vector (fault_fs_for_crash_report, raw pointer, signal-safe)
- Fault injection log path includes GetDbLabel() for per-DB uniqueness
- Secondary paths are ephemeral, generated directly by C++
- flockfile/funlockfile for multi-DB stats atomicity

Differential Revision: D104959942
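To make the path handling split concrete, here is a minimal sketch of the two steps described in the summary: pure path computation from a base directory, then a count check against num_dbs. It is not the code from this PR. The signature of get_subdirectory_paths(), the `db_<i>` subdirectory naming, and the helper name validate_paths() are all assumptions made for illustration.

```python
import os


def get_subdirectory_paths(base_dir, num_dbs, prefix="db_"):
    """Return one path per DB under base_dir. Pure computation, no I/O.

    The signature and the "db_<i>" naming are hypothetical.
    """
    if num_dbs < 1:
        raise ValueError("num_dbs must be >= 1")
    return [os.path.join(base_dir, f"{prefix}{i}") for i in range(num_dbs)]


def validate_paths(paths_arg, num_dbs):
    """Split a comma-separated path argument and check its count.

    Hypothetical helper: paths are validated but never expanded.
    """
    paths = [p for p in paths_arg.split(",") if p]
    if len(paths) != num_dbs:
        raise ValueError(
            f"expected {num_dbs} path(s), got {len(paths)}: {paths_arg!r}"
        )
    return paths
```

Keeping the first function free of I/O means directory creation and destroy_db_initially handling can stay in one place (gen_cmd in the description above), and the same count check can serve both db and expected_values_dir.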
4de8a83 to b072a0e
🟡 Codex Code Review
Auto-triggered after CI passed, reviewing commit e775a94.
❌ Codex review failed before producing findings.
✅ Claude Code Review
Auto-triggered after CI passed, reviewing commit e775a94.

Summary
Clean, well-structured PR that adds multi-DB stress testing with proper resource sharing and isolation. The design is sound -- per-DB paths, shared cache/WBM/rate_limiter, per-DB fault injection, and clear phase separation in the orchestration code. Most incompatible features are correctly guarded.

High-severity findings (1):

Findings
🔴 HIGH
H1.

| Shared Resource | Thread-Safe? | Guarded? |
|---|---|---|
| block_cache | Yes | N/A (safe to share) |
| WriteBufferManager | Yes | N/A (safe to share) |
| RateLimiter | Yes | N/A (safe to share) |
| CompressedSecondaryCache | Yes | Blocked with num_dbs>1 |
| PoolSizeChangeThread | N/A | Blocked with num_dbs>1 |
| SyncPoint (debug) | Fragile | Safe via destruction order |
Positive Observations
- Clean 4-phase orchestration design (create -> init -> run -> cleanup)
- Proper shared resource factoring (cache, WBM, rate limiter)
- Good feature incompatibility guards in both C++ and Python
- Signal-safe crash callback with vector of fault_fs pointers
- [db_N] output labeling for easy per-DB filtering
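
As an illustration of the per-DB filtering mentioned in the last point, the following sketch groups captured stdout lines by their label. The label pattern is an assumption inferred from the `[db_0] Stress Test: ...` example in the PR description, and group_by_db() is a hypothetical helper, not part of db_stress or db_crashtest.py.

```python
import re
from collections import defaultdict

# Assumed label shape, inferred from the "[db_0] Stress Test: ..." example.
LABEL_RE = re.compile(r"^\[(db_\d+)\]\s?(.*)$")


def group_by_db(lines):
    """Map each DB label to its output lines; unlabeled lines go under None."""
    groups = defaultdict(list)
    for line in lines:
        text = line.rstrip("\n")
        match = LABEL_RE.match(text)
        if match:
            # Store the message without its "[db_N] " prefix.
            groups[match.group(1)].append(match.group(2))
        else:
            groups[None].append(text)
    return dict(groups)
```

For example, `group_by_db(open("stress.log"))["db_1"]` would return only the lines emitted for the second DB, where `stress.log` is a placeholder for wherever the run's stdout was captured.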
ℹ️ About this response
Generated by Claude Code.
Review methodology: claude_md/code_review.md
Limitations:
- Claude may miss context from files not in the diff
- Large PRs may be truncated
- Always apply human judgment to AI suggestions
Commands:
- /claude-review [context]: Request a code review
- /claude-query <question>: Ask about the PR or codebase