Back to browse

Bit-packed segmented prime sieve in Rust, 32KB working memory, 0 deps

by whisprer·Feb 14, 2026·3 points·1 comment

AI Analysis

●●●Banger
The Take

Impressively engineered low-level, bit-packed segmented sieve in Rust — a technical showcase with niche appeal to systems and algorithm enthusiasts.

Category
Target Audience

Systems programmers, Rust developers, performance engineers, algorithm enthusiasts

Post Description

Started with a hyper-compact C++ bit-packed sieve — months of whittling it down to ~9 lines of raw bit manipulation. One bit per odd number, Brian Kernighan's bit trick for extraction, hardware tzcnt intrinsics. Then I ported it to Rust.

The borrow checker had opinions. You can't .filter() over a bit array (immutable borrow) and .for_each() mutate it simultaneously. Fair enough — explicit loops for the sieve phase, iterators for collection. Same assembly output, provably safe. Along the way I caught two assertion bugs the sieve was too correct to trigger (off-by-one on the 10,001st prime; 499,999 is composite, not prime).

The flat sieve hit a wall at ~10M — the bit array exceeds L1 cache (32KB) and every sieving pass thrashes. So I built a segmented version: bootstrap sieving primes via flat sieve up to √n, then process the full range in 32KB L1-sized segments. One buffer, reused, never leaves cache.

Results (single-threaded, 25 iterations, cross-verified against primal crate):

n=500K: ~440µs, 30KB sieve mem n=10M: ~10ms, 32KB sieve mem n=50M: ~52ms, 32KB sieve mem (flat: ~67ms) n=100M: ~137ms, 32KB sieve mem (flat: ~190ms)

Compared to the primal crate (Rust's best-in-class), raw speed is ~2x behind — primal uses wheel-30 factorisation which skips multiples of 2, 3, and 5. This sieve only skips evens. But memory is where it wins: 187x less sieve working memory than primal at n=50M (32KB vs 6MB), and a tighter result vector from prime-counting pre-allocation.

The target use case is embedded/constrained: ESP32 nodes, Raspberry Pi Zeros, distributed timing networks where every KB matters. It's also a clean reference implementation — one file, zero dependencies, compiles with a single rustc invocation.

Single file. No Cargo. No crates. Just:

rustc -C opt-level=3 -C target-cpu=native primer.rs && ./primer

Code, benchmarks, borrow checker writeup, and a full development narrative are all in the repo.

Similar Projects

Open Source●●Solid

Sievers a Rust SIEVE filter editor

The repo delivers the two UX pieces you actually need: a visual rule builder plus a raw SIEVE editor and an IMAP connection dialog for direct uploads. It’s pragmatic — cross-compile notes, native build deps, and a tuned release profile (strip, LTO, size opt) show the author cared about distribution and binary size — but it’s solving a narrow, well-known problem rather than inventing one.

Niche GemSolve My Problem
oscarcp
103mo ago