19 examples covering BLAS Level 1, 2, and 3 operations — from scalar vector ops to batched mixed-precision matrix multiply.
Enable with -Dcublas=true.
zig build run-cublas-<name> -Dcublas=true
# Examples:
zig build run-cublas-gemm -Dcublas=true
zig build run-cublas-axpy -Dcublas=true| Example | File | BLAS Op | Description |
|---|---|---|---|
axpy |
axpy.zig | SAXPY |
y = α·x + y |
dot |
dot.zig | SDOT |
Dot product: Σ xᵢyᵢ |
nrm2_asum |
nrm2_asum.zig | SNRM2, SASUM |
L2 norm and L1 norm |
scal |
scal.zig | SSCAL |
Vector scaling: x = α·x |
amax_amin |
amax_amin.zig | ISAMAX, ISAMIN |
Index of max/min absolute value |
swap_copy |
swap_copy.zig | SSWAP, SCOPY |
Vector swap and copy |
rot |
rot.zig | SROT, SROTG |
Givens plane rotation |
cosine_similarity |
cosine_similarity.zig | SDOT, SNRM2 |
Cosine similarity via L1 ops |
| Example | File | BLAS Op | Description |
|---|---|---|---|
gemv |
gemv.zig | SGEMV |
y = α·A·x + β·y (matrix-vector) |
symv_syr |
symv_syr.zig | SSYMV, SSYR |
Symmetric matrix-vector ops |
trmv_trsv |
trmv_trsv.zig | STRMV, STRSV |
Triangular multiply and solve |
| Example | File | BLAS Op | Description |
|---|---|---|---|
gemm |
gemm.zig | SGEMM |
C = α·A·B + β·C (single precision) |
gemm_batched |
gemm_batched.zig | SGEMMStridedBatched |
Strided batched GEMM |
gemm_ex |
gemm_ex.zig | cublasGemmEx |
Mixed-precision GEMM (FP16 in, FP32 out) |
symm |
symm.zig | SSYMM |
Symmetric matrix multiply: C = α·A·B + β·C |
trsm |
trsm.zig | STRSM |
Triangular solve: op(A)·X = α·B |
syrk |
syrk.zig | SSYRK |
Symmetric rank-k update: C = α·A·Aᵀ + β·C |
geam |
geam.zig | SGEAM |
Matrix add / scale / transpose |
dgmm |
dgmm.zig | SDGMM |
Diagonal matrix multiply |
cuBLAS uses column-major storage. For row-major buffers (Zig/C default), swap A↔B and M↔N in sgemm:
// ✅ Correct for row-major C = A × B (M×N = M×K × K×N):
try blas.sgemm(.no_transpose, .no_transpose,
n, m, k, // ← swap n and m
alpha,
b, n, // ← B first, leading dim n
a, k, // ← A second, leading dim k
beta, c, n);See docs/cublas/README.md for the full explanation.
→ Full API reference: docs/cublas/README.md