Wordchipper – Rust BPE tokenizer, 9x faster than tiktoken
Nine times faster than tiktoken-rs with swappable lexer backends for benchmarking.
An inference architecture that makes LLMs stateful. Patent pending (US 64/050,345).
Injects raw KV tensors directly into model cache to skip 90% of token recomputation.
LLM infrastructure engineers and AI startup CTOs
vLLM · TGI · KV Cache optimizations
Nine times faster than tiktoken-rs with swappable lexer backends for benchmarking.
Cuts long-context costs by 90% by swapping disk IO for expensive GPU recomputation.
Drops token usage 97% and blocks injection — smart sanitization beats raw WebFetch, drop-in replacement.
Ctrl+Shift+S snapshots your whole context, LLM summarizes it, one-click restore.
Interactive LLM explainer covering tokenization through KV cache across 15 chapters.
4x token savings on screenshots with readable text at 800px grey.