Thaw – Git branch for a running LLM (fork agents, skip prefill)
Git branch for LLM agents — 400x faster forking with preserved KV cache.
Agent-native inference engine with O(1) fork latency for tree-structured reasoning
O(1) fork latency makes tree search 1000x faster than vLLM for agentic workloads.
ML engineers building agentic systems with tree-of-thought or multi-path reasoning
vLLM · SGLang · TGI
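One way to picture the constant-time fork claimed above is a Git-like chain of KV segments, where a fork adds a pointer to the shared prefix instead of recomputing or copying it. The following is a minimal sketch under that assumption, not Thaw's actual implementation: the names `Segment`, `Branch`, `fork`, and `append` are hypothetical, and plain Python lists stand in for GPU-resident KV tensors.

```python
from typing import List, Optional


class Segment:
    """A run of KV entries plus a pointer to the segment it extends (like a commit)."""

    def __init__(self, parent: Optional["Segment"] = None) -> None:
        self.parent = parent
        self.entries: List[int] = []  # stand-in for the KV tensors of this run
        self.frozen = False           # set once another branch shares this segment


class Branch:
    """A sequence whose KV cache is a chain of segments (hypothetical API)."""

    def __init__(self, head: Optional[Segment] = None) -> None:
        self.head = head if head is not None else Segment()

    def append(self, token: int) -> None:
        # A frozen segment is shared, so new tokens go into a fresh child segment.
        if self.head.frozen:
            self.head = Segment(parent=self.head)
        self.head.entries.append(token)

    def fork(self) -> "Branch":
        # Constant-time: freeze the current head and point a new segment at it.
        # Nothing in the prefix is copied or recomputed.
        self.head.frozen = True
        return Branch(Segment(parent=self.head))

    def tokens(self) -> List[int]:
        # Walk parent pointers back to the root, then flatten in order.
        segments = []
        node: Optional[Segment] = self.head
        while node is not None:
            segments.append(node.entries)
            node = node.parent
        return [t for seg in reversed(segments) for t in seg]


# Usage: one shared prefix, two branches that diverge after the fork.
root = Branch()
for tok in [1, 2, 3]:          # pretend this is the expensive prefill
    root.append(tok)
left = root.fork()
right = root.fork()
left.append(10)
right.append(20)
print(left.tokens(), right.tokens())
```

In this sketch the fork cost does not depend on prefix length, and both branches keep pointing at the same prefix segment. A real engine would also need reference counting or eviction for segments that no branch uses any more, which is omitted here.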
Routes LLM requests to GPUs with cached KV prefixes, skipping redundant prefill computation.
Tower-style middleware stacking for inference guardrails beats bolted-on if-statements.
Multi-tier caching + tree-sitter indexing, but lacks agent autonomy competitors ship today.
SSD-cached KV blocks avoid the re-prefill cost on context shifts, making Claude Code now viable locally.
llms.txt tree structure lets agents navigate context instead of dumping everything.