
VectorLens – See why your RAG hallucinates, no config

by gustav-proxi · Mar 9, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Big Brain · Dark Horse

Zero-config RAG tracing when LangSmith needs heavy instrumentation.

Strengths
  • Monkey-patching common clients means zero code changes to existing pipelines
  • Local hallucination detection with sentence-transformers keeps data private
  • Auto-intercepts OpenAI, Anthropic, Gemini, ChromaDB, FAISS without config
Weaknesses
  • Monkey-patching is fragile across library version updates
  • RAG observability space already has LangSmith, Arize, Helicone
Target Audience

ML engineers building RAG applications

Similar To

LangSmith · Arize Phoenix · Helicone

Post Description

I built VectorLens because I was tired of "log file archaeology" every time my RAG pipeline hallucinated. Usually, when an LLM gives a wrong answer, you're stuck guessing which retrieved chunk misled it—or why the right chunk was ignored.

Existing observability tools require a cloud signup, an enterprise contract, or heavy manual instrumentation of your code. I wanted something that stayed local and just worked.

The Solution: Three lines of code

Python:

    import vectorlens
    vectorlens.serve()  # Open http://127.0.0.1:7756
    # Your RAG code runs as-is (OpenAI, Anthropic, Gemini, ChromaDB, FAISS, etc. are auto-intercepted)

How it works technically:

Zero-Config Interception: It monkey-patches common LLM and Vector DB clients. You don't have to change your functions or wrap your calls; it intercepts the data flow automatically.
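To make the interception idea concrete, here is a minimal sketch of monkey-patching in general. It is not VectorLens's actual code: `FakeLLMClient`, `patch_method`, and `captured` are hypothetical names invented for illustration, and a real tool would patch methods on the vendor clients instead.

```python
import functools


class FakeLLMClient:
    """Hypothetical stand-in for a real LLM client; not any vendor's API."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


captured = []  # every intercepted call is recorded here


def patch_method(cls, name):
    """Replace cls.<name> with a wrapper that records inputs and outputs."""
    original = getattr(cls, name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        result = original(self, *args, **kwargs)
        captured.append(
            {"method": name, "args": args, "kwargs": kwargs, "result": result}
        )
        return result

    setattr(cls, name, wrapper)
    return original  # keep a handle so the patch can be undone later


original_complete = patch_method(FakeLLMClient, "complete")

# The calling code is unchanged, yet the call is now observed.
client = FakeLLMClient()
answer = client.complete("What is RAG?")
```

The wrapper depends on the patched attribute keeping its name and call shape, which is why this approach can break when a library renames or restructures its client between versions.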

Local Hallucination Detection: It uses sentence-transformers (a 22MB model) to compare each sentence of the LLM’s output against the retrieved context. If a sentence's similarity to the context is too low, that sentence is flagged as a hallucination.
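The sentence-versus-context check can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: `toy_embed` is a bag-of-words stand-in so the example runs with the standard library only, and the `0.5` threshold is an arbitrary value picked for the demo. With sentence-transformers, the embedding step would be swapped for a model's `encode` call.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Bag-of-words stand-in for a sentence embedding (demo assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def flag_unsupported(answer: str, chunks: list[str], threshold: float = 0.5):
    """Return (sentence, best_score, flagged) for each sentence of the answer."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    chunk_vecs = [toy_embed(c) for c in chunks]
    report = []
    for sentence in sentences:
        vec = toy_embed(sentence)
        # A sentence counts as supported if ANY retrieved chunk is close to it.
        best = max((cosine(vec, cv) for cv in chunk_vecs), default=0.0)
        report.append((sentence, best, best < threshold))
    return report


chunks = ["The Eiffel Tower is in Paris.", "It was completed in 1889."]
answer = "The Eiffel Tower is in Paris. It is painted bright green every year."
report = flag_unsupported(answer, chunks)
for sentence, score, flagged in report:
    print(f"{'FLAG' if flagged else 'ok  '} {score:.2f} {sentence}")
```

In this toy run the first sentence matches a chunk almost exactly and passes, while the second shares few words with either chunk and is flagged. The right threshold depends on the embedding model and would need tuning on real data.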

Perturbation Attribution: To figure out "why," it measures how the output changes when specific chunks are removed or modified. This gives you a clear score of which data points actually drove the response.
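One simple form of perturbation attribution is leave-one-out: regenerate the answer with each chunk removed and score how much the output changes. The sketch below assumes that variant, which may differ from what VectorLens does. `toy_generate` is a hypothetical deterministic stand-in for an LLM call, and Jaccard word overlap is used as the change measure purely to keep the example self-contained.

```python
import re


def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two strings, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def toy_generate(question: str, chunks: list[str]) -> str:
    """Hypothetical LLM stand-in: echoes chunks sharing a word with the question."""
    q = tokens(question)
    return " ".join(c for c in chunks if tokens(c) & q)


def leave_one_out(question, chunks, generate):
    """Score each chunk by how much the answer changes when it is removed."""
    baseline = generate(question, chunks)
    scores = []
    for i in range(len(chunks)):
        reduced = chunks[:i] + chunks[i + 1:]
        perturbed = generate(question, reduced)
        scores.append(1.0 - jaccard(baseline, perturbed))
    return scores


chunks = [
    "The warranty lasts two years.",
    "Shipping takes five days.",
    "Returns are free within Norway.",
]
scores = leave_one_out("How long is the warranty?", chunks, toy_generate)
print(scores)  # [1.0, 0.0, 0.0]
```

Only the first chunk influences the toy answer, so it gets the full score and the others get zero. Note that this variant costs one extra generation per chunk, which matters when the generator is a real LLM.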

Fully Local: No data leaves your machine. The dashboard is a local React app updated via WebSockets.

Why use this over other tools?

Privacy: No cloud uploads or API keys for the debugger itself.

No Vendor Lock-in: Works with local models (Ollama/Mistral) just as easily as it does with GPT-4.

Speed: It runs detection in a background thread, so it doesn't block your main application logic.
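The background-thread pattern mentioned under Speed can be sketched with a queue and a worker. This is a generic illustration, not the project's code: `run_detection`, `on_llm_response`, and the job format are hypothetical names chosen for the example.

```python
import queue
import threading

jobs = queue.Queue()
results = []


def run_detection(job):
    """Hypothetical placeholder for the slow hallucination check."""
    return {"answer": job["answer"], "checked": True}


def worker():
    """Drain the queue until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        results.append(run_detection(job))
        jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()


def on_llm_response(answer: str) -> str:
    """Runs on the request path: enqueue the work and return immediately."""
    jobs.put({"answer": answer})
    return answer


returned = on_llm_response("Paris is the capital of France.")
jobs.join()  # only for the demo; real request code would not wait here
print(results)
```

The caller gets its answer back without waiting for the check. In CPython, a CPU-heavy check in a thread still shares the interpreter lock with the main thread, so the benefit is largest when the check spends its time in native code or I/O.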

I’m looking for feedback on the attribution accuracy and on which vector DBs you'd like to see supported next.

GitHub: https://github.com/Gustav-Proxi/vectorlens
