GitHub Repository

Agent-native inference engine with O(1) fork latency for tree-structured reasoning

3 stars · Rust

Dendrite – O(1) KV cache forking for tree-structured LLM inference

by RyeCatcher · Mar 30, 2026 · 3 points · 1 comment

AI Analysis

●●● Banger · Wizardry · Big Brain · Zero to One

O(1) fork latency makes tree search 1000x faster than vLLM for agentic workloads.

Strengths
  • Copy-on-write block tables enable constant-time branching without KV cache duplication
  • Built-in MCTS and beam search algorithms with UCT scoring out of the box
  • Memory efficiency: 1.1 GB vs. 6 GB for a 6-branch exploration with a 4K-token prefix
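
To make the copy-on-write idea concrete, here is a minimal sketch in Rust. It is not Dendrite's actual implementation: the names `Sequence`, `Block`, and `BLOCK_SIZE` are hypothetical, token ids stand in for real key/value tensors, and `std::rc::Rc` stands in for whatever reference counting the engine uses. The point it illustrates is that a fork only bumps a reference count, and a write copies only the handle list and the one block being modified.

```rust
use std::rc::Rc;

/// Token slots per block (hypothetical value, chosen small for illustration).
const BLOCK_SIZE: usize = 4;

/// One fixed-size cache block. A real engine would hold K/V tensors here;
/// token ids stand in for them in this sketch.
#[derive(Clone, Debug)]
struct Block {
    tokens: Vec<u32>,
}

/// A sequence's view of the cache: a shared table of shared block handles.
#[derive(Clone, Debug)]
struct Sequence {
    table: Rc<Vec<Rc<Block>>>,
}

impl Sequence {
    fn new() -> Self {
        Sequence { table: Rc::new(Vec::new()) }
    }

    /// Fork in constant time: one reference-count increment, no data copied.
    fn fork(&self) -> Self {
        Sequence { table: Rc::clone(&self.table) }
    }

    /// Append one token, copying only what is still shared with another fork.
    fn push(&mut self, token: u32) {
        // Clones the handle list (not the block contents) if the table is shared.
        let table = Rc::make_mut(&mut self.table);
        let needs_new_block = match table.last() {
            Some(block) => block.tokens.len() == BLOCK_SIZE,
            None => true,
        };
        if needs_new_block {
            table.push(Rc::new(Block { tokens: Vec::with_capacity(BLOCK_SIZE) }));
        }
        let last = table.last_mut().expect("table has at least one block");
        // Clones the final block only if another sequence still references it.
        Rc::make_mut(last).tokens.push(token);
    }

    /// Flatten the blocks back into the full token sequence.
    fn tokens(&self) -> Vec<u32> {
        self.table.iter().flat_map(|b| b.tokens.iter().copied()).collect()
    }

    /// Count positions where both sequences point at the same physical block.
    fn shared_blocks(&self, other: &Sequence) -> usize {
        self.table
            .iter()
            .zip(other.table.iter())
            .filter(|(a, b)| Rc::ptr_eq(a, b))
            .count()
    }
}
```

In this sketch, forking a six-token sequence and appending one token to the child leaves the full first block shared and duplicates only the partially filled last block. The first write after a fork still copies the list of block handles, so the constant-time claim here applies to the fork itself, not to the first divergent write.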
Weaknesses
  • 3 stars and 5 open issues signal a very early-stage project, unproven in production
  • Only useful for tree-structured inference, not single-sequence chat workloads
Category
Target Audience

ML engineers building agentic systems with tree-of-thought or multi-path reasoning

Similar To

vLLM · SGLang · TGI

Similar Projects

AI/ML · ●●● Banger

Thaw – Git branch for a running LLM (fork agents, skip prefill)

Git branch for LLM agents — 400x faster forking with preserved KV cache.

Wizardry · Big Brain · Solve My Problem
nilsmatteson
3020d ago