GitHub Repository

Token budget enforcement for AI agents. Hard limits, configurable policy, zero infrastructure required.

3 stars · Python

Tokencap – Token budget enforcement across your AI agents

by pykul·Apr 4, 2026·7 points·0 comments

AI Analysis

●● Solid · Solve My Problem · Big Brain

Tracks tokens, not dollars: a clever design that avoids pricing-drift headaches.

Strengths
  • In-process enforcement blocks calls before tokens are spent, not after the bill arrives.
  • Tracking tokens instead of dollars avoids drift when provider pricing changes.
  • SQLite default with Redis option scales from single scripts to multi-agent deployments.
Weaknesses
  • AI cost monitoring already has Helicone, LangSmith, and provider-native dashboards.
  • No dashboard or visualization; purely programmatic control may limit adoption.
Target Audience

Developers building AI agents with LangChain, CrewAI, or direct API clients

Similar To

Helicone · LangSmith · Portkey

Post Description

I built this after repeatedly hitting the same wall: there was no good way to enforce token budgets in application code. Provider caps are account-level and tell you what happened, not what is happening.

Two ways to add it:

# Direct client wrapper
import anthropic
import tokencap

client = tokencap.wrap(anthropic.Anthropic(), limit=50_000)

# LangChain, CrewAI, AutoGen, etc.
tokencap.patch(limit=50_000)
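To make the wrapper idea concrete, here is a minimal sketch of how an in-process wrapper could refuse a call before it is sent. It is not tokencap's implementation: `BudgetedClient`, `TokenBudgetExceeded`, `FakeClient`, and `FakeResponse` are hypothetical names invented for illustration, and the fake client stands in for a real SDK client so the example runs without network access.

```python
from dataclasses import dataclass


class TokenBudgetExceeded(RuntimeError):
    """Raised when the budget is already spent (hypothetical name)."""


@dataclass
class FakeResponse:
    # Stand-in for a provider response that reports token usage.
    input_tokens: int
    output_tokens: int


class FakeClient:
    """Stand-in for a real SDK client; every call reports 30 + 20 tokens."""

    def __init__(self):
        self.calls = 0

    def create(self, prompt: str) -> FakeResponse:
        self.calls += 1
        return FakeResponse(input_tokens=30, output_tokens=20)


class BudgetedClient:
    """Wraps a client and refuses calls once the token limit is reached."""

    def __init__(self, client, limit: int):
        self._client = client
        self.limit = limit
        self.used = 0

    def create(self, prompt: str) -> FakeResponse:
        # Check before the call, so no request is sent once the budget is gone.
        if self.used >= self.limit:
            raise TokenBudgetExceeded(f"used {self.used} of {self.limit} tokens")
        response = self._client.create(prompt)
        # Record the usage the provider reported for this call.
        self.used += response.input_tokens + response.output_tokens
        return response


inner = FakeClient()
client = BudgetedClient(inner, limit=100)
client.create("first")   # 50 tokens used so far
client.create("second")  # 100 tokens used so far
try:
    client.create("third")
    blocked = False
except TokenBudgetExceeded:
    blocked = True
```

In this sketch the third call raises before reaching the inner client, so the provider is contacted only twice. A check of this kind can overshoot the limit by at most one call, because actual usage is only known once the response arrives.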

Four actions at configurable thresholds: WARN, DEGRADE (transparent model swap), BLOCK, and WEBHOOK. SQLite out of the box, Redis for multi-agent setups.
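As an illustration of threshold-based actions, the sketch below maps the fraction of budget used to the actions that should fire. The threshold values, the `POLICY` table, and the helper names `actions_for` and `pick_model` are assumptions made for this example, not tokencap's configuration format or API.

```python
from enum import Enum


class Action(Enum):
    WARN = "warn"
    DEGRADE = "degrade"
    BLOCK = "block"
    WEBHOOK = "webhook"


# Hypothetical policy: (fraction of budget used, action), in ascending order.
POLICY = [
    (0.50, Action.WARN),
    (0.80, Action.DEGRADE),
    (0.90, Action.WEBHOOK),
    (1.00, Action.BLOCK),
]


def actions_for(used: int, limit: int, policy=POLICY) -> list:
    """Return every action whose threshold has been reached."""
    fraction = used / limit
    return [action for threshold, action in policy if fraction >= threshold]


def pick_model(actions: list, primary: str, fallback: str) -> str:
    """DEGRADE swaps to the fallback model; otherwise keep the primary."""
    return fallback if Action.DEGRADE in actions else primary


low = actions_for(10_000, 50_000)
half = actions_for(25_000, 50_000)
high = actions_for(46_000, 50_000)
spent = actions_for(50_000, 50_000)
model = pick_model(high, primary="primary-model", fallback="fallback-model")
```

With a 50,000-token limit, 25,000 tokens used triggers only WARN in this sketch, while 46,000 also triggers DEGRADE and WEBHOOK, so `pick_model` returns the fallback. The model names are placeholders.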

One design decision worth mentioning: tokencap tracks tokens, not dollars. Token counts come directly from the provider response and never drift with pricing changes.
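A rough sketch of what counting tokens from the provider response can look like. The field names follow the usage objects returned by the Anthropic Messages API (`input_tokens`, `output_tokens`) and the OpenAI Chat Completions API (`prompt_tokens`, `completion_tokens`). The `total_tokens` helper itself is hypothetical and is not part of tokencap.

```python
def total_tokens(usage: dict) -> int:
    """Sum input and output tokens from a provider usage payload."""
    if "input_tokens" in usage:  # Anthropic-style field names
        return usage["input_tokens"] + usage["output_tokens"]
    if "prompt_tokens" in usage:  # OpenAI Chat Completions-style field names
        return usage["prompt_tokens"] + usage["completion_tokens"]
    raise ValueError(f"unrecognized usage payload: {sorted(usage)}")


anthropic_style = total_tokens({"input_tokens": 120, "output_tokens": 80})
openai_style = total_tokens({"prompt_tokens": 300, "completion_tokens": 50})
```

No price table appears anywhere in this calculation, which is the point of the design: a dollar-based budget would need a per-model price lookup that goes stale whenever a provider changes its rates.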

Happy to answer any questions.

Similar Projects

Developer Tools · ●●● Banger

AgentBudget – Real-time dollar budgets for AI agents

Real-time dollar limits on AI agents, monkey-patched into OpenAI/Anthropic SDKs.

Solve My Problem · Dark Horse · Ship It
sahiljagtapyc
783mo ago
Developer Tools · ●● Solid

Preventing runaway LLM agents (enforcement layer)

VERONICA puts an enforcement shim between your agent and the model so you can halt costly spirals before a request hits the provider. It natively exposes hard budget enforcement, circuit breakers, retry containment, and degradation levels. The README and the runnable runaway-loop demo make the failure mode concrete, and the API (BudgetEnforcer, RuntimeContext, BudgetExceeded) is small and practical. I'd like to see richer observability and adapter docs for common agent frameworks, but as an enforcement-first primitive this is a clever, useful tool.

Niche Gem · Big Brain
amabito
124mo ago