Vecdb – local-first hybrid vector database in Rust (HNSW and BM25)
Single-binary Rust DB fusing HNSW and BM25 without cloud dependencies or API keys.

Zig-inspired caching makes local vector search daemonless and incremental.
Backend developers
Ripgrep · LanceDB · Semantic Search CLI
The pipeline is Expansion → BM25/phrase/vector retrieval → RRF fusion → optional Qwen reranking. Each stage is independently tunable.
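For readers unfamiliar with the fusion stage, here is a minimal sketch of standard Reciprocal Rank Fusion. It is not taken from the sift codebase: the function name `rrf_fuse`, the string document ids, and the constant `k` (60 is a commonly used default) are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion over several ranked lists of document ids.
/// Each document scores sum(1 / (k + rank)) across the lists it appears in,
/// where rank is 1-based. Hypothetical helper, not sift's actual API.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc_id) in ranking.iter().enumerate() {
            let rank = (i + 1) as f64;
            *scores.entry(doc_id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first; ties broken by id so output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because RRF only looks at ranks, the BM25, phrase, and vector retrievers never need their raw scores normalized against each other, which is one reason each stage can be tuned on its own.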
The part I found most interesting to build: the caching layer is modeled after Zig's build system. A BLAKE3 manifest store tracks filesystem metadata so sift knows which files changed without re-reading them. A content-addressable blob store holds pre-extracted text, BM25 term frequencies, and pre-embedded vectors, so repeat queries skip neural inference entirely and go straight to dot-product scoring. Identical files across projects share a single blob entry.
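A rough sketch of that two-level cache, under stated assumptions: all type and function names here (`FileMeta`, `Blob`, `Cache`, `get_or_index`, `dot`) are hypothetical, the blob holds only text and an embedding, and the standard library's `DefaultHasher` stands in for BLAKE3 purely to keep the example dependency-free. It is not collision resistant and would not be suitable for a real content-addressable store.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in content hash. ASSUMPTION: the real system uses BLAKE3;
/// DefaultHasher is used here only so the sketch needs no external crates.
fn content_key(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// Cheap filesystem metadata used to detect changes without reading the file.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct FileMeta {
    pub len: u64,
    pub modified_ns: u64,
}

/// Pre-computed artifacts stored once per distinct file content.
#[derive(Clone, Debug)]
pub struct Blob {
    pub text: String,
    pub embedding: Vec<f32>,
}

#[derive(Default)]
pub struct Cache {
    manifest: HashMap<String, (FileMeta, u64)>, // path -> (metadata, content key)
    blobs: HashMap<u64, Blob>,                  // content key -> artifacts
}

impl Cache {
    /// Returns the content key for `path`, reading and embedding only when needed.
    pub fn get_or_index<R, E>(&mut self, path: &str, meta: FileMeta, read: R, embed: E) -> u64
    where
        R: FnOnce() -> Vec<u8>,
        E: FnOnce(&str) -> Vec<f32>,
    {
        // Fast path: metadata unchanged, so the file is not re-read at all.
        if let Some((cached_meta, key)) = self.manifest.get(path).copied() {
            if cached_meta == meta {
                return key;
            }
        }
        // Slow path: read the file, then embed only if the content is new.
        let bytes = read();
        let key = content_key(&bytes);
        self.blobs.entry(key).or_insert_with(|| {
            let text = String::from_utf8_lossy(&bytes).into_owned();
            let embedding = embed(&text);
            Blob { text, embedding }
        });
        self.manifest.insert(path.to_string(), (meta, key));
        key
    }

    pub fn blob(&self, key: u64) -> Option<&Blob> {
        self.blobs.get(&key)
    }

    pub fn blob_count(&self) -> usize {
        self.blobs.len()
    }
}

/// Dot-product scoring of a query vector against a cached embedding.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

In this sketch the manifest answers "did this path change?" from metadata alone, while the blob map answers "have we seen these bytes before?", so two projects containing the same file pay for embedding once.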
Benchmarked on SciFact (5,185 docs): vector retrieval reaches 0.826 nDCG@10 with perfect recall at ~26 ms p50. If latency is the constraint, BM25 alone takes 5 ms.
Repo: github.com/rupurt/sift
Pure Rust CPU-only code search with persistent index beats transformer-heavy alternatives.
Hybrid search in-process — BM25 + vectors + RRF, zero external DB, validated on BEIR benchmarks.
Local SigLIP embeddings + 68K-term semantic tagging in a single Rust binary, zero cloud.
Hybrid search toggle per query when Pinecone charges for everything.
Shared memory across Claude Code, Cursor, Windsurf—solves agent drift via hybrid search and audit trails.