GitHub Repository

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt. Drop-in for any LLM backend.

1 star · Python

Director-AI – token-level NLI+RAG

by anulum·Feb 26, 2026·2 points·7 comments

AI Analysis

●●● Banger · Big Brain · Wizardry

Token-level streaming halt stops hallucinations mid-sentence, before the user sees them: a genuinely novel safety layer.

Strengths

• Dual-entropy scoring (DeBERTa NLI + ChromaDB RAG) is architecturally clever: it gates the stream live rather than reviewing output after the fact
• Works with any OpenAI-compatible backend (Ollama, Groq, Claude), so it is a genuine drop-in guardrail with no vendor lock-in
• Streaming kernel design with Rust interlock is non-obvious, and the 66.2% balanced accuracy is reported with full transparency on methodology

Weaknesses

• NLI + RAG alone don't solve confidently wrong answers in unfamiliar domains; knowledge base coverage is the real bottleneck
• Dual licensing (AGPL + commercial) creates friction for closed-source teams, which may limit adoption

Post Description

Hey HN,

After watching too many agents confidently lie in production, I built Director-AI.

It sits between your LLM and the user, scoring every generated token with:
• 0.6× DeBERTa-v3 NLI (contradiction detection)
• 0.4× RAG against your own ChromaDB knowledge base
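The 0.6/0.4 weighting can be read as a convex combination of the two signals. The following is a minimal sketch, not the project's code: it assumes both signals are already normalized to [0, 1] with higher meaning "more consistent", and the function and argument names are hypothetical.

```python
NLI_WEIGHT = 0.6  # weight on the NLI signal, as stated in the post
RAG_WEIGHT = 0.4  # weight on the RAG signal, as stated in the post


def combined_coherence(nli_score: float, rag_score: float) -> float:
    """Blend two scores in [0, 1] into one coherence value in [0, 1].

    Assumption: for both inputs, higher means "more consistent".
    The real project may define its signals differently.
    """
    for name, value in (("nli_score", nli_score), ("rag_score", rag_score)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return NLI_WEIGHT * nli_score + RAG_WEIGHT * rag_score
```

Because the weights sum to 1, the result stays in [0, 1], which keeps a single threshold meaningful regardless of which signal dominates.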

If coherence < threshold → Rust kernel halts the stream before the token is sent.
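The halt rule can be illustrated with a generator that scores each candidate token before releasing it. This is a pure-Python sketch under stated assumptions, not the Rust kernel itself: `gated_stream`, `score_fn`, the default threshold of 0.5, and the toy scorer are all hypothetical.

```python
from typing import Callable, Iterable, Iterator, List


def gated_stream(
    tokens: Iterable[str],
    score_fn: Callable[[List[str]], float],
    threshold: float = 0.5,  # assumed value; the post does not state one
) -> Iterator[str]:
    """Yield tokens until the running text scores below `threshold`.

    Each candidate token is scored together with everything already
    emitted and is only yielded if the score passes, so a failing
    token never reaches the consumer.
    """
    emitted: List[str] = []
    for token in tokens:
        candidate = emitted + [token]
        if score_fn(candidate) < threshold:
            return  # halt: drop this token and everything after it
        emitted.append(token)
        yield token


# Toy scorer for illustration only: coherence collapses once "Mars" appears.
def toy_score(candidate: List[str]) -> float:
    return 0.1 if "Mars" in candidate else 0.9


print(list(gated_stream(["Paris", "is", "on", "Mars", "today"], toy_score)))
# prints ['Paris', 'is', 'on']
```

Scoring before yielding is the key ordering: checking after the yield would let the offending token through.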

Key technical bits:
• Works with any OpenAI-compatible endpoint (Ollama, vLLM, llama.cpp, Groq, OpenAI, Claude…)
• StreamingKernel + windowed scoring
• GroundTruthStore.add() for easy fact ingestion
• Dual licensing: AGPL open + commercial (closed-source/SaaS OK)
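"Windowed scoring" is not spelled out in the post. One plausible reading is that each new token is scored together with a fixed number of preceding tokens rather than the whole output. The sketch below illustrates that reading only: `windowed_scores`, `score_fn`, and the default window size of 8 are assumptions, not the StreamingKernel API.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Iterator, Tuple


def windowed_scores(
    tokens: Iterable[str],
    score_fn: Callable[[str], float],
    window_size: int = 8,  # assumed value; the post does not state one
) -> Iterator[Tuple[str, float]]:
    """Pair each token with the score of the last `window_size` tokens."""
    if window_size < 1:
        raise ValueError("window_size must be >= 1")
    window: Deque[str] = deque(maxlen=window_size)
    for token in tokens:
        window.append(token)  # the oldest token is evicted automatically
        yield token, score_fn(" ".join(window))
```

A bounded window keeps the per-token scoring cost constant as the response grows, at the price of missing contradictions with text that has already scrolled out of the window.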

Honest AggreFact numbers inside (66.2% balanced acc with streaming enabled). Not claiming SOTA on static NLI — the value is in the live gating + custom KB system.
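For reference, balanced accuracy is the mean of the recall on each class, so it is not inflated by class imbalance the way plain accuracy is. A small self-contained sketch of the standard binary definition follows; the labels in the example are made up for illustration and are not AggreFact data.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of positive-class recall and negative-class recall (binary 0/1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in y_true")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (tp / pos + tn / neg)


# Made-up labels: 3 of 4 positives and 1 of 2 negatives are correct.
print(balanced_accuracy([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1]))
# prints 0.625
```

On a binary task, a classifier that always predicts one class scores 50% balanced accuracy, which is the baseline against which a figure like 66.2% should be read.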

Repo + full examples: https://github.com/anulum/director-ai

Would love feedback on the scoring weights, halt logic, or kernel design. What hallucination problems are you solving today?

Similar Projects

AI/ML · ●●● Banger

Detecting LLM hallucinations in <1ms using hidden states (RTX3050, 4GB)

Detects hallucinations via hidden state geometry in under 1ms with no training required.

Wizardry · Big Brain · Dark Horse

yubainu

113mo ago

AI/ML · ●● Solid

DocForge – Multi-Agent RAG That Fact-Checks Its Own Answers

Multi-agent fact-checking loop, but RAG hallucination fixes are table stakes now.

Big Brain · Ship It

toheed11

114mo ago

AI/ML · ● Mid

A calculator to expose the hidden infrastructural costs behind RAG

Breaks down hidden RAG costs like vector storage overhead and HNSW indexing fees.

Solve My Problem

abarth23

202mo ago

AI/ML · ●● Solid

Detect LLM hallucinations via geometric drift (0.9 AUC, 1% overhead)

Detects hallucinations via latent space geometry instead of text analysis, but 54% detection rate is incomplete.

Big Brain · Wizardry · Ship It

yubainu

113mo ago

AI/ML · ●● Solid

5-translation RAG matrix fixing LLM religious hallucinations

Parallel translation comparison beats single-source RAG for theological accuracy.

Niche Gem · Big Brain

uk9854321

302mo ago

AI/ML · ● Mid

A marketplace for LLM-powered webapps earning on token margins

Crowded AI marketplace model overshadows the genuinely clever AST-based code editing technique.

Bold Bet · Ship It

cryptoz

101mo ago