Sift – local hybrid search CLI in a single Rust binary

by rupurt · Mar 10, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Big Brain · Niche Gem

Zig-inspired caching makes local vector search daemonless and incremental.

Strengths
  • BLAKE3 manifest store tracks file changes without re-reading or re-embedding content.
  • Single binary runs the full BM25 plus vector pipeline with no background daemon required.
  • Content-addressable blob store allows identical files across projects to share entries.
Weaknesses
  • Benchmarks rely on the SciFact dataset; results still need validation on large real-world codebases.
  • Rust CLI limits accessibility for non-technical users compared to GUI tools.
Target Audience

Backend developers

Similar To

Ripgrep · LanceDB · Semantic Search CLI

Post Description

Built this for agentic workflows where you need repeatable, low-latency search over local codebases and docs without standing up a daemon or an indexing service.

Pipeline is Expansion → BM25/phrase/vector Retrieval → RRF Fusion → optional Qwen reranking. Each stage is independently tunable.
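For readers unfamiliar with the fusion stage, here is a minimal sketch of Reciprocal Rank Fusion in Rust. It is not taken from the sift source: the function name `rrf_fuse`, the string document ids, and the smoothing constant `k` (60 is a commonly used default) are assumptions for illustration. Each retriever contributes `1 / (k + rank)` per document, and the sums decide the fused order.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
/// Each inner list is ordered best-first. `k` is the smoothing constant.
/// Hypothetical helper for illustration, not sift's actual API.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (i, doc) in ranking.iter().enumerate() {
            // Ranks are 1-based in the RRF formula: score += 1 / (k + rank).
            let rank = i as f64 + 1.0;
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because RRF only looks at ranks, the BM25, phrase, and vector lists can be fused without normalizing their raw scores against each other.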

The part I found most interesting to build: the caching layer is modeled after Zig's build system. A BLAKE3 manifest store tracks filesystem metadata so sift knows which files changed without re-reading them. A content-addressable blob store holds pre-extracted text, BM25 term frequencies, and pre-embedded vectors — so repeat queries skip neural inference entirely and go straight to dot-product scoring. Identical files across projects share a single blob entry.
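The two-level cache described above can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not sift's implementation: the types `FileMeta`, `Blob`, and `Cache`, the `index` method, and the `fake_embed` placeholder are all hypothetical, and the standard library's `DefaultHasher` stands in for BLAKE3 only to keep the example dependency-free (it is not collision-resistant and would not be suitable for a real content-addressable store).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Cheap filesystem metadata used to decide whether a file may have changed.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct FileMeta {
    pub size: u64,
    pub mtime_ns: u128,
}

/// Derived data stored once per distinct content (hypothetical layout).
pub struct Blob {
    pub text: String,
    pub embedding: Vec<f32>,
}

#[derive(Default)]
pub struct Cache {
    /// Manifest: path -> (metadata seen at last index, content key).
    manifest: HashMap<String, (FileMeta, u64)>,
    /// Blob store: content key -> derived data.
    blobs: HashMap<u64, Blob>,
    /// Counts how often the expensive embedding step actually ran.
    pub embed_calls: usize,
}

/// Stand-in for a BLAKE3 digest of the file contents (assumption, see lead-in).
fn content_key(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// Placeholder for neural inference; a real embedder would go here.
fn fake_embed(text: &str) -> Vec<f32> {
    vec![text.len() as f32]
}

impl Cache {
    /// Returns the content key for `path`. `read` is only called when the
    /// metadata differs from the manifest entry, and embedding only runs
    /// when no blob with the same content key exists yet.
    pub fn index(&mut self, path: &str, meta: FileMeta, read: impl FnOnce() -> Vec<u8>) -> u64 {
        if let Some((old_meta, key)) = self.manifest.get(path) {
            if *old_meta == meta {
                return *key; // Unchanged metadata: no read, no embed.
            }
        }
        let bytes = read();
        let key = content_key(&bytes);
        if !self.blobs.contains_key(&key) {
            self.embed_calls += 1;
            let text = String::from_utf8_lossy(&bytes).into_owned();
            let embedding = fake_embed(&text);
            self.blobs.insert(key, Blob { text, embedding });
        }
        self.manifest.insert(path.to_string(), (meta, key));
        key
    }
}
```

In this sketch, indexing the same path twice with unchanged metadata never invokes the reader, and two paths with identical bytes resolve to one blob, so the embedding step runs once for both.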

Benchmarked on SciFact (5,185 docs): vector retrieval reaches 0.826 nDCG@10 with perfect recall at ~26 ms p50. BM25 alone runs in 5 ms if latency is the constraint.
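As a reminder of what the nDCG@10 figure measures, here is a small Rust sketch of the metric using the standard log2 discount. The function names and the convention of passing relevance grades as `f64` slices are assumptions for illustration; this is not the benchmark harness used for the numbers above.

```rust
/// DCG over the first `k` relevance grades, discounting rank r by log2(r + 1).
fn dcg_at_k(rels: &[f64], k: usize) -> f64 {
    rels.iter()
        .take(k)
        .enumerate()
        // `i` is 0-based, so the 1-based rank is i + 1 and the discount is log2(i + 2).
        .map(|(i, rel)| rel / (i as f64 + 2.0).log2())
        .sum()
}

/// nDCG@k: DCG of the returned ranking divided by the DCG of the ideal ordering.
/// `ranked_rels` holds the relevance of each returned document, best-ranked first.
/// `all_rels` holds the relevance of every judged document for the query.
pub fn ndcg_at_k(ranked_rels: &[f64], all_rels: &[f64], k: usize) -> f64 {
    let mut ideal = all_rels.to_vec();
    ideal.sort_by(|a, b| b.partial_cmp(a).unwrap());
    let idcg = dcg_at_k(&ideal, k);
    if idcg == 0.0 {
        0.0 // No relevant documents judged for this query.
    } else {
        dcg_at_k(ranked_rels, k) / idcg
    }
}
```

A query whose only relevant document is returned first scores 1.0; if it is returned second, the score drops to 1 / log2(3), roughly 0.63. The reported figure would be the mean of such per-query scores.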

Repo: github.com/rupurt/sift
