v0.2.0 — Primer & Full Documentation
The bit-packed prime sieve goes cache-aware. This release adds a segmented sieve variant that processes the range in L1-sized 32 KB windows, a complete benchmark harness, and a full documentation suite covering everything from the original C++ origins to production deployment patterns.
Highlights
Primer (seg.rs) — L1-cache-friendly processing in 32 KB segments. Single reused buffer, never leaves cache. 1.4× faster than the flat sieve at n=100M, with 187× less sieve working memory than the primal crate. Zero dependencies, single file, compiles with rustc alone.
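The windowing idea can be sketched as follows — a minimal byte-per-flag version for clarity (the actual seg.rs bit-packs the buffer); `count_primes` and `SEG_SIZE` are illustrative names, not the file's API:

```rust
// Sketch of a segmented Sieve of Eratosthenes: one reused L1-sized buffer.
const SEG_SIZE: usize = 32 * 1024; // one window; one flag byte per number here

fn count_primes(n: usize) -> usize {
    if n < 2 {
        return 0;
    }
    // 1. Flat-sieve the base primes up to sqrt(n).
    let limit = (n as f64).sqrt() as usize + 1;
    let mut is_comp = vec![false; limit + 1];
    let mut base = Vec::new();
    for p in 2..=limit {
        if !is_comp[p] {
            base.push(p);
            let mut m = p * p;
            while m <= limit {
                is_comp[m] = true;
                m += p;
            }
        }
    }
    // 2. Walk [2, n] in fixed-size windows, reusing one buffer.
    let mut count = 0;
    let mut seg = vec![false; SEG_SIZE];
    let mut low = 2;
    while low <= n {
        let high = (low + SEG_SIZE - 1).min(n);
        for f in seg.iter_mut() {
            *f = false;
        }
        for &p in &base {
            // First multiple of p inside [low, high], never below p*p.
            let mut m = ((low + p - 1) / p).max(p) * p;
            while m <= high {
                seg[m - low] = true;
                m += p;
            }
        }
        for i in 0..=(high - low) {
            if !seg[i] {
                count += 1;
            }
        }
        low = high + 1;
    }
    count
}

fn main() {
    println!("{}", count_primes(100)); // → 25
}
```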
rustc -C opt-level=3 -C target-cpu=native seg.rs && ./seg

Benchmark harness (all_bench.rs) — Head-to-head comparison against the primes v0.3 and primal v0.3 crates at six input sizes spanning 10K–100M. 25 iterations per measurement, cross-verified prime counts, full statistical output (min/median/mean/max/σ).
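The summary statistics the harness reports can be computed along these lines (a hedged sketch; `stats` is an illustrative helper, not the harness's actual function):

```rust
// Illustrative helper: summary statistics over one benchmark's samples.
// Returns (min, median, mean, max, σ); σ is the population standard deviation.
fn stats(mut samples: Vec<f64>) -> (f64, f64, f64, f64, f64) {
    assert!(!samples.is_empty());
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    let n = samples.len() as f64;
    let (min, max) = (samples[0], *samples.last().unwrap());
    let median = samples[samples.len() / 2]; // odd counts (e.g. 25 iterations) hit the true middle
    let mean = samples.iter().sum::<f64>() / n;
    let var = samples.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    (min, median, mean, max, var.sqrt())
}

fn main() {
    let (min, med, mean, max, sd) = stats(vec![3.0, 1.0, 2.0]);
    println!("min={min} med={med} mean={mean} max={max} σ={sd:.3}");
}
```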
Bug fixes in the original flat sieve — Two incorrect test assertions that the sieve itself was too correct to trigger:
- `primes[10_000]` is 104,743, not 104,729 (off-by-one in the zero-indexed lookup)
- 499,999 is composite, not prime; the largest prime ≤ 500,000 is 499,979
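The composite claim is easy to spot-check with trial division (a throwaway sketch; `smallest_factor` is not part of the crate):

```rust
// Trial division: returns the smallest prime factor of n (n itself if prime).
fn smallest_factor(n: u64) -> u64 {
    let mut d = 2;
    while d * d <= n {
        if n % d == 0 {
            return d;
        }
        d += 1;
    }
    n
}

fn main() {
    println!("{}", smallest_factor(499_999)); // → 31, so 499,999 = 31 × 16,129 = 31 × 127²
    println!("{}", smallest_factor(499_979)); // → 499,979 (prime)
}
```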
Performance (Median, 25 Iterations)
| n | wofl flat | wofl segmented | primal | Seg vs flat |
|---|---|---|---|---|
| 500K | 434 µs | 443 µs | 170 µs | ~1× |
| 10M | 10.3 ms | 10.3 ms | 3.2 ms | ~1× (crossover) |
| 50M | 66.6 ms | 52.5 ms | 30.7 ms | 1.27× |
| 100M | 189.7 ms | 137.0 ms | 61.7 ms | 1.38× |
Memory (n = 50M)
| | wofl segmented | primal | Ratio |
|---|---|---|---|
| Sieve working memory | 32 KB | 6.0 MB | 187× smaller |
| Result vector | 24.7 MB | 32.0 MB | 1.3× smaller |
| Total | 24.8 MB | 38.0 MB | 1.5× smaller |
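The result-vector figure is dominated by π(n) stored `u64` values; the flat sieve's pre-allocation (see the improvements below) uses a prime-counting upper bound, which can be sketched with a Dusart-style estimate (illustrative; `prime_count_upper_bound` and its small-n floor are not the crate's exact code):

```rust
// Dusart-style upper bound: π(n) ≤ n/ln n · (1 + 1.2762/ln n) for n > 1,
// used to size the result vector once, with no reallocations.
fn prime_count_upper_bound(n: u64) -> usize {
    if n < 17 {
        return 7; // π(16) = 6; small-n floor keeps the estimate safe
    }
    let x = n as f64;
    let ln = x.ln();
    (x / ln * (1.0 + 1.2762 / ln)).ceil() as usize
}

fn main() {
    // Bound for n = 50M, times 8 bytes per u64, approximates the preallocated size.
    let bound = prime_count_upper_bound(50_000_000);
    println!("{} primes bound → {:.1} MB", bound, (bound * 8) as f64 / 1e6);
}
```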
Flat Sieve Improvements (v0.1.0 → v0.2.0)
- Pre-allocated result vector via prime-counting upper bound (zero reallocs)
- Overflow-safe integer `isqrt()` using `checked_mul` (correct for all `u64`, including `u64::MAX`)
- Hoisted inner-loop step computation
- Early termination in collection phase
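The overflow-safe idea can be sketched as a binary search in which `checked_mul` treats overflow as "mid is too large" (a sketch in the spirit of the change, not the file's exact code):

```rust
// Overflow-safe integer square root: largest r with r*r <= n.
fn isqrt(n: u64) -> u64 {
    // Invariant: lo*lo <= n; hi is too large (hi*hi > n or overflows u64).
    // (2^32)^2 overflows u64, so hi starts one past the largest possible root.
    let (mut lo, mut hi) = (0u64, u32::MAX as u64 + 1);
    while lo + 1 < hi {
        let mid = lo + (hi - lo) / 2;
        match mid.checked_mul(mid) {
            Some(sq) if sq <= n => lo = mid, // mid*mid fits and is small enough
            _ => hi = mid,                   // overflow counts as "too big"
        }
    }
    lo
}

fn main() {
    println!("{}", isqrt(u64::MAX)); // → 4294967295 (= 2^32 − 1)
}
```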
Documentation
This release includes the full doc suite:
| Document | Contents |
|---|---|
| PERFORMANCE_ANALYSIS.md | Benchmarks, scaling behaviour, memory deep dive, target use cases |
| SIDE_BY_SIDE.md | Line-by-line C++ → Rust translation guide |
| BORROW_CHECKER_FIX.md | Why the iterator approach was rejected and how explicit loops fix it |
| writeup.md | Full development narrative — from C++ origins through primer |
| CHANGELOG.md | Version history |
| CONTRIBUTING.md | Build/test workflow, performance contribution standards |
| SECURITY.md | Scope, known limitations, reporting |
Files
| File | Description |
|---|---|
| seg.rs | Primer — standalone, zero deps, rustc compilable |
| og.rs | Flat sieve — optimised from v0.1.0 |
| all_bench.rs | Benchmark harness (requires Cargo + primes/primal crates) |
| Cargo.toml | Manifest for the benchmark harness |
| primes.cpp | The original 5-line C++ that started it all |
Building
# primer (standalone — no Cargo needed)
rustc -C opt-level=3 -C target-cpu=native seg.rs -o seg
./seg
# Tests
rustc --test seg.rs -o test_seg && ./test_seg
# Benchmark harness (requires Cargo)
cargo new prime_bench && cd prime_bench
# Copy all_bench.rs → src/main.rs, Cargo.toml in place
cargo run --release
What's Next
- Wheel-30 factorisation (~1.8× speedup, would close the primal gap to ~15%)
- Parallel segments via rayon (~N× on N cores, segments are independent)
- Compile-time generation for fixed small prime tables on embedded targets
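On the wheel-30 item: only 8 of every 30 residues are coprime to 30, versus the 15 odd residues the current bit-pack tracks, which is where a figure near 15/8 ≈ 1.8× comes from. A tiny sketch of the residue classes (illustrative only; not part of the codebase):

```rust
// The 8 residues mod 30 that can contain primes greater than 5 (plus 1 itself).
fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 { a } else { gcd(b, a % b) }
}

fn main() {
    let wheel: Vec<u64> = (1..30).filter(|&r| gcd(r, 30) == 1).collect();
    println!("{:?}", wheel); // → [1, 7, 11, 13, 17, 19, 23, 29]
}
```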
Full changelog: CHANGELOG.md