⚠️ Historical audit report: mark findings that lack current code evidence with `<!-- TODO: add source file evidence -->`. Remove outdated findings.

Last Audit: 2026-04-19 · Auditor: Copilot · Status: ✅ Pass (Beta)

| Metric | Result |
|---|---|
| Build System Registration | ✅ Verified |
| Source Files | 30 (.cpp in src/gpu/) |
| Test Coverage | ✅ All Phase 1–4 items complete; GPU paths tested with hardware skip |
| Open TODOs | 30 files contain TODOs (WASM runtime injection, cuVS production wiring) |
| Open Stubs | 2 (cuVS/RAFT production wiring pending; WASM runtime injection pending) |
| Security Issues | None |
- All GPU source files registered in `cmake/CMakeLists.txt` and `cmake/ModularBuild.cmake`.
- CUDA paths guarded by `THEMIS_ENABLE_CUDA`.
- ROCm/HIP paths guarded by `THEMIS_ENABLE_HIP`.
- cuVS/RAFT integration guarded by `THEMIS_ENABLE_CUVS`.
- NCCL/RCCL guarded by `THEMIS_ENABLE_NCCL`.
- WASM kernel sandbox guarded by `THEMIS_ENABLE_WASM`.
- Profiling integration guarded by `THEMIS_ENABLE_NVTX` and `THEMIS_ENABLE_ROCTX`.

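
The guard macros listed above could be consumed in code roughly as follows. This is a minimal sketch, assuming each `THEMIS_ENABLE_*` macro is simply defined when the feature is enabled and undefined otherwise; the helper names `compiledGpuFeatures` and `hasGpuBackend` are hypothetical and do not come from the audited sources.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists the optional GPU features compiled into this build.
// Only the THEMIS_ENABLE_* macro names are taken from the audit report.
inline std::vector<std::string> compiledGpuFeatures() {
    std::vector<std::string> features;
#ifdef THEMIS_ENABLE_CUDA
    features.push_back("cuda");
#endif
#ifdef THEMIS_ENABLE_HIP
    features.push_back("hip");
#endif
#ifdef THEMIS_ENABLE_CUVS
    features.push_back("cuvs");
#endif
#ifdef THEMIS_ENABLE_NCCL
    features.push_back("nccl");
#endif
#ifdef THEMIS_ENABLE_WASM
    features.push_back("wasm");
#endif
    return features;
}

// Hypothetical helper: true only if at least one GPU backend was compiled in.
// Callers can use this to choose the CPU path on builds without CUDA or HIP.
inline bool hasGpuBackend() {
#if defined(THEMIS_ENABLE_CUDA) || defined(THEMIS_ENABLE_HIP)
    return true;
#else
    return false;
#endif
}
```

On a build configured without any of these definitions, `compiledGpuFeatures()` returns an empty vector and `hasGpuBackend()` returns `false`, which is the situation a CPU-only CI runner would see.
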
| File | Purpose |
|---|---|
| `admin_api.cpp` | JSON admin API with tenant breakdown and dry-run |
| `alerts.cpp` | Threshold-based alert manager with callbacks |
| `audit_log.cpp` | Ring-buffer structured audit event log |
| `cluster_coordinator.cpp` | Multi-node GPU cluster coordination |
| `cluster_topology.cpp` | NVLink/InfiniBand topology awareness |
| `config.cpp` | GPU config validation with dry-run simulation |
| `device_discovery.cpp` | CUDA/ROCm device enumeration with CPU sentinel |
| `feature_flags.cpp` | Runtime GPU feature enable/disable |
| `gpu_memory_manager_edition.cpp` | Edition-aware VRAM with tenant quotas |
| `gpu_module.cpp` | Integration facade: policy → CB → alloc → launch |
| `graph_cache.cpp` | CUDA graph capture (LRU 32-entry, QueryShape key) |
| `kernel_validator.cpp` | FNV-1a checksum kernel whitelist |
| `launcher.cpp` | Named async GPU streams with CPU fallback budget |
| `load_balancer.cpp` | Multi-GPU: ROUND_ROBIN, LEAST_LOADED, FIRST_HEALTHY |
| `memory_pool.cpp` | Slab allocator with defragmentation |
| `metrics.cpp` | Prometheus-compatible counter/gauge metrics |
| `mig_manager.cpp` | NVIDIA MIG (Multi-Instance GPU) partition management |
| `p2p_transfer.cpp` | Peer-to-peer GPU memory transfer management |
| `policy.cpp` | GPU resource policy enforcement |
| `profiler.cpp` | NVTX/rocTX profiler marker integration |
| `query_accelerator.cpp` | GPU-accelerated scan/sort/aggregate/join/ANN |
| `rocm_backend.cpp` | AMD ROCm/HIP backend implementation |
| `safe_fail.cpp` | Safe-fail handler for GPU errors with CPU fallback |
| `stream_manager.cpp` | CUDA/HIP stream lifecycle management |
| `tensor_buffer.cpp` | GPU tensor buffer with pinned memory support |
| `time_slice_scheduler.cpp` | Time-slice scheduling for multi-tenant GPU isolation |
| `training_loop.cpp` | GPU training loop for on-device model fine-tuning |
| `unified_memory.cpp` | CUDA unified memory management for CPU-GPU transfers |
| `vulkan_backend.cpp` | Vulkan compute backend for cross-platform GPU support |
| `wasm_kernel_sandbox.cpp` | WASM sandbox for untrusted GPU kernel isolation |

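
To illustrate the "FNV-1a checksum kernel whitelist" entry in the table above, here is a minimal sketch of how such a check might work. The 64-bit hash width, the class name `KernelWhitelist`, and its `allow`/`validate` methods are assumptions made for illustration; the audit does not state which FNV-1a variant or interface `kernel_validator.cpp` actually uses.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <unordered_map>

// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
// (Assumption: the audited code may use a different width.)
constexpr std::uint64_t fnv1a64(std::string_view data) {
    std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (unsigned char byte : data) {
        hash ^= byte;
        hash *= 1099511628211ULL;  // FNV 64-bit prime
    }
    return hash;
}

// Hypothetical whitelist: maps a kernel name to the checksum of its approved binary.
class KernelWhitelist {
public:
    // Approve a kernel by pinning the checksum of its current binary image.
    void allow(const std::string& name, std::string_view binary) {
        checksums_[name] = fnv1a64(binary);
    }

    // Accept only known names whose binary still matches the pinned checksum.
    bool validate(const std::string& name, std::string_view binary) const {
        auto it = checksums_.find(name);
        return it != checksums_.end() && it->second == fnv1a64(binary);
    }

private:
    std::unordered_map<std::string, std::uint64_t> checksums_;
};
```

A caller would register each approved kernel once with `allow` and call `validate` before every launch. FNV-1a is a fast non-cryptographic hash, so a scheme like this catches unknown or accidentally modified kernels but is not, on its own, a defense against deliberately crafted collisions.
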
- `tests/test_gpu_graph_cache.cpp`: CUDA graph capture, LRU eviction, QueryShape keying
- `tests/test_gpu_query_accelerator.cpp`: scan, sort, aggregate, hashJoin, annSearch
- `tests/test_gpu_stream_manager.cpp`: stream creation, CPU fallback
- `tests/test_gpu_memory_manager.cpp`: tenant quota enforcement, fragmentation tracking
- `tests/test_gpu_kernel_validator.cpp`: FNV-1a whitelist, unknown kernel rejection
- GPU hardware tests: skipped gracefully when no GPU is present in CI
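
The LRU eviction and QueryShape keying that the graph cache test covers can be sketched as below. This is an illustrative model only: the fields of `QueryShape`, the `GraphHandle` alias (an `int` standing in for a captured graph so the sketch runs without a GPU), and the `GraphCache` interface are all assumptions. Only the 32-entry default capacity and the LRU policy come from the report.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

// Hypothetical key: the audit names QueryShape but does not list its fields.
struct QueryShape {
    std::uint32_t opKind;
    std::uint32_t columnCount;
    std::uint64_t rowBucket;
    bool operator==(const QueryShape& o) const {
        return opKind == o.opKind && columnCount == o.columnCount &&
               rowBucket == o.rowBucket;
    }
};

struct QueryShapeHash {
    std::size_t operator()(const QueryShape& s) const {
        std::size_t h = std::hash<std::uint32_t>{}(s.opKind);
        h = h * 31 + std::hash<std::uint32_t>{}(s.columnCount);
        h = h * 31 + std::hash<std::uint64_t>{}(s.rowBucket);
        return h;
    }
};

using GraphHandle = int;  // stand-in for a captured graph object

class GraphCache {
public:
    explicit GraphCache(std::size_t capacity = 32) : capacity_(capacity) {}

    // Return the cached graph, if any, and mark it as most recently used.
    std::optional<GraphHandle> get(const QueryShape& key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        order_.splice(order_.begin(), order_, it->second);
        return it->second->second;
    }

    // Insert or update an entry, evicting the least recently used one when full.
    void put(const QueryShape& key, GraphHandle graph) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = graph;
            order_.splice(order_.begin(), order_, it->second);
            return;
        }
        if (capacity_ == 0) return;
        if (order_.size() == capacity_) {
            index_.erase(order_.back().first);  // back of the list is the LRU entry
            order_.pop_back();
        }
        order_.emplace_front(key, graph);
        index_[key] = order_.begin();
    }

    std::size_t size() const { return order_.size(); }

private:
    using Entry = std::pair<QueryShape, GraphHandle>;
    std::size_t capacity_;
    std::list<Entry> order_;  // front = most recently used
    std::unordered_map<QueryShape, std::list<Entry>::iterator, QueryShapeHash> index_;
};
```

The list-plus-map layout keeps both lookup and eviction at constant time on average, which is a common choice for small fixed-capacity caches; the audited implementation may be structured differently.
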
- Kernel whitelist enforcement: `KernelValidator::validate()` called on every kernel launch path; unknown kernels rejected before memory allocation.
- Cross-tenant VRAM isolation: per-tenant quota enforcement in `GPUMemoryManagerEdition`.
- Circuit-breaker safe-fail: GPU errors trigger automatic CPU fallback; circuit breaker state exposed in admin API.
- Audit log for GPU operations: ring-buffer captures all allocation, deallocation, fallback, and admin events.
- cuVS/RAFT production wiring: ANN search falls back to CPU brute force; the cuVS/RAFT production integration is still a stub (Issue #2381).
- WASM kernel sandbox runtime: the WASM isolation infrastructure is complete but requires a concrete `WasmRuntime` injection (Issue #1572, same as base module).
- GPU memory zeroing overhead: zeroing on deallocation is enabled for security and can be disabled in performance mode (operator tradeoff).
- Per-tenant VRAM quotas support multi-tenant SaaS isolation requirements.
- Audit log for GPU operations supports SOC 2 resource access tracking.
- Dynamic time-slicing for multi-tenant isolation (Issue #1795) prevents noisy-neighbor DoS.
- WASM sandbox (when fully wired) will meet third-party code execution security requirements for marketplace plugins.
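
The per-tenant VRAM quota enforcement mentioned above can be pictured with a small bookkeeping sketch. The class `TenantQuotaTracker` and its methods are hypothetical and only model the accounting; they are not the interface of `GPUMemoryManagerEdition`, and no real device memory is allocated.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical quota bookkeeping: tracks reserved bytes per tenant against a cap.
class TenantQuotaTracker {
public:
    void setQuota(const std::string& tenant, std::size_t bytes) {
        tenants_[tenant].quota = bytes;
    }

    // Reserve bytes for a tenant. Unknown tenants and requests that would
    // exceed the remaining quota are refused without changing any state.
    bool tryReserve(const std::string& tenant, std::size_t bytes) {
        auto it = tenants_.find(tenant);
        if (it == tenants_.end()) return false;
        Usage& u = it->second;
        // Compare against the remaining headroom to avoid unsigned overflow.
        if (u.used > u.quota || bytes > u.quota - u.used) return false;
        u.used += bytes;
        return true;
    }

    // Return bytes to a tenant's budget, clamping at zero.
    void release(const std::string& tenant, std::size_t bytes) {
        auto it = tenants_.find(tenant);
        if (it == tenants_.end()) return;
        Usage& u = it->second;
        u.used -= (bytes < u.used ? bytes : u.used);
    }

    std::size_t used(const std::string& tenant) const {
        auto it = tenants_.find(tenant);
        return it == tenants_.end() ? 0 : it->second.used;
    }

private:
    struct Usage {
        std::size_t quota = 0;
        std::size_t used = 0;
    };
    std::unordered_map<std::string, Usage> tenants_;
};
```

In this model one tenant exhausting its quota has no effect on another tenant's reservations, which is the isolation property the quota bullets describe. A production allocator would additionally need thread safety and a link to the actual device allocations.
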