Cost.dev (YC W21) – making agents cost-aware and 79% cheaper to call
79% token reduction for agent CLI calls with predicate pushdown and efficient output.
Token-optimized AI agent runtime with 12 providers, multi-agent DAG, TOON compression & encrypted vault .
Custom TOON format saves tokens but LangChain and CrewAI already solve orchestration.
Developers building multi-agent AI systems
LangChain · CrewAI · AutoGen
The core idea: most agent runtimes serialize everything as JSON. JSON is great for humans but terrible for tokens. So I built TOON (Token-Oriented Object Notation) — same structure, 28–44% fewer tokens. At scale that's millions of tokens saved per day.
What else it does: → Routes across 12 providers (Anthropic, OpenAI, Groq, Ollama, DeepSeek, OpenRouter + more) → 4-tier smart model routing — picks the cheapest model that can handle the task → Multi-agent DAG: Planner → Research → Execution → Memory → Composer → Encrypted vault (AES-256-GCM), never stores secrets in plaintext → Prompt injection defense + PII redaction built in → 19 hot-swappable skills, < 60ms reload → Full benchmark suite included — 9ms dry-run pipeline latency
It's CLI-first, TypeScript, runs on Linux/Mac/WSL2.
Repo: https://github.com/Rawknee-69/Beta-Claw
Still rough in places but the core is solid. Brutal feedback welcome.
79% token reduction for agent CLI calls with predicate pushdown and efficient output.
JinaAI and Firecrawl already solve this—Anno's token math is solid, but it's the same problem, solved.
Minifies code before sending to LLM, slashing Claude Code costs by 50% in benchmarks.
Persona-based prompting cuts tokens 47% without breaking code like Caveman styles do.
93% token reduction via deterministic sanitization instead of trusting raw web content.
Simple wrapper preventing runaway agent costs before they hit your credit card.