Sakha – An AI employee – onboarding tool for businesses
Knowledge gap detection flags missing docs when multiple hires ask the same question.
Local-first context ingestion and retrieval for AI tools. SQLite + embeddings + MCP server for Cursor & Claude.
Local RAG without cloud: sync your codebase, search hybrid, feed Cursor via MCP.
Backend developers, AI tool power users managing complex multi-repo projects
Ollama · Perplexity (local context) · Retrieval-augmented generation tools like LlamaIndex
I built this because I kept hitting the same problem: AI tools are powerful but have no memory of my complex multi-repo project. They can't search our internal docs, past incidents, or architecture decisions. Cloud RAG services exist, but they're complex, expensive, and your data leaves your machine. I wanted something I could point at my sources and just run `ctx sync all`.
Quick start:
    # Install (pre-built binaries available for macOS/Linux/Windows)
    cargo install --git https://github.com/parallax-labs/context-harness.git

    # Create config and initialize
    ctx init

    # Sync your data sources (filesystem, Git, S3, or Lua scripts)
    ctx sync all

    # Search from CLI
    ctx search "how does the auth service validate tokens"

    # Or start the MCP server for Cursor/Claude Desktop
    ctx serve mcp
What it does differently from other RAG tools:

- *Truly local*: SQLite + single binary. No Docker, no Postgres, no cloud. Local embeddings (bundled or pure-Rust) so semantic and hybrid search work with zero API keys. Back up your entire knowledge base with `cp ctx.sqlite ctx.sqlite.bak`.
- *Hybrid search*: FTS5 keyword scoring + cosine vector similarity with configurable blending. Works without embeddings too (keyword-only mode); with local embeddings you get full hybrid search offline.
- *Lua extensibility*: Write custom connectors, tools, and agents in Lua without recompiling anything. The Lua VM has HTTP, JSON, crypto, and filesystem APIs built in.
- *Extension registry*: `ctx registry init` installs a Git-backed community registry with 10 connectors (Jira, Confluence, Slack, Notion, RSS, Stack Overflow, Linear, etc.), 4 MCP tools, and 2 agent personas.
- *MCP protocol*: Cursor, Claude Desktop, Continue.dev, and any MCP-compatible client can connect and search your knowledge base directly.
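To make the "configurable blending" in hybrid search concrete, here is a minimal Python sketch of one common way to combine keyword and vector scores. It is an illustration under stated assumptions, not the project's actual implementation: the function names, the min-max normalization, and the default `alpha` are all hypothetical. It assumes each retriever has already returned its own candidate set, and that keyword scores are "higher is better" (raw FTS5 `bm25()` values are lower-is-better, so they would need to be negated first).

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def min_max(scores):
    """Rescale a {doc_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(keyword_scores, query_vec, doc_vecs, alpha=0.6, limit=10):
    """Blend keyword and vector scores (hypothetical helper).

    alpha = 0.0 ranks by keyword score only; alpha = 1.0 ranks by
    vector similarity only. Documents missing from one candidate set
    contribute 0.0 for that signal.
    """
    kw = min_max(keyword_scores)
    sem = min_max({d: cosine(query_vec, v) for d, v in doc_vecs.items()})
    blended = {
        d: (1.0 - alpha) * kw.get(d, 0.0) + alpha * sem.get(d, 0.0)
        for d in set(kw) | set(sem)
    }
    ranked = sorted(blended.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]
```

Calling `hybrid_rank({"b": 3.0, "a": 1.0}, [1.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0]}, alpha=0.0)` puts the keyword winner `b` first, while `alpha=1.0` puts the vector winner `a` first. Normalizing both signals before blending keeps one score scale from dominating the other, which is why the candidate counts fed into each side matter as much as `alpha` itself.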
Embeddings: you can run *fully offline* — the default build uses local embeddings (fastembed with bundled ONNX on most platforms, or a pure-Rust tract path on Linux musl and Intel Mac). No API key required. Optional: Ollama (local LLM stack) or OpenAI if you prefer. Keyword-only mode needs zero deps. There's no built-in auth layer; it's designed for local or trusted network use.
Stack: Rust, SQLite (WAL mode), FTS5, mlua (Lua 5.4), axum, MCP Streamable HTTP. MIT licensed.
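Because the database runs in WAL mode, recent writes can sit in a separate `-wal` file until they are checkpointed, so a plain file copy is safest when nothing is writing. If you want a snapshot while a sync or the server may be running, one option is SQLite's online backup API. The sketch below uses Python's standard `sqlite3` module; the helper name is hypothetical and this is generic SQLite usage, not a feature of `ctx` itself.

```python
import sqlite3


def backup_sqlite(src_path: str, dst_path: str) -> None:
    """Copy a SQLite database via the online backup API (Python 3.7+).

    The copy is taken through a normal connection, so committed data
    that still lives in the WAL is included in the snapshot.
    """
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

For example, `backup_sqlite("ctx.sqlite", "ctx.sqlite.bak")` would produce the same backup file as the `cp` one-liner, using the file names from this post.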
GitHub: https://github.com/parallax-labs/context-harness
Docs: https://parallax-labs.github.io/context-harness/
Community Registry: https://github.com/parallax-labs/ctx-registry
If you find it useful, a star on GitHub is always appreciated.
Would love feedback on the search quality tuning (hybrid alpha, candidate counts) and the Lua extension model.
Tags @promptless on PRs; it drafts docs without leaving GitHub or signing up.
Ticket-triggered AI engineer when Cursor, Devin, and Sweep already own this category.
Email AI with multi-tool context, but 'smart email' is a crowded category.
It compares the last 7 days to a 30-day baseline, flags spikes in workaround language, expectation gaps and escalations, and pushes Slack alerts with real customer quotes — the contextual snippets are the product's strongest hook. That said, the page glosses over classifier accuracy, tuning controls, and triage links; in a crowded space this'll live or die on precision and how easily teams can act on the alerts.
608kb Rust binary, but Git already handles snapshots better.