ToolGuard – Pytest for AI agent tool calls
Finally, pytest for AI tool calls when evals only test intelligence.
The "Cloudflare for AI Agents". 7-layer security interceptor, real-time observability dashboard, and automated reliability testing for MCP and AI tool chains. Prevent hallucinations, prompt injection, and destructive tool calls.
Layer 2 execution testing without LLMs when eval frameworks only test intelligence.
AI agent developers building Python tool chains
LangSmith evals · Pytest · Hypothesis
No LLM needed to run tests. It reads your type hints, generates a Pydantic schema, and deterministically breaks things.
pip install py-toolguard
GitHub: https://github.com/Harshit-J004/toolguard
If you are building complex tool chains, I would be incredibly honored if you checked out the repo. Brutal feedback on the architecture is highly encouraged!
Finally, pytest for AI tool calls when evals only test intelligence.
Distilled Gemini tool-calling into a 26M model that runs at 1200 tok/s on phones.
Restricted DSL for AI agents wraps existing functions instead of sandboxing entire runtimes.
No-framework Python agent build that actually runs, not just theory.
PolyMCP turns Python functions into a single Pyodide WASM bundle so agents can call tools directly in the browser or at the edge — neat and practical. It keeps MCP niceties like input validation, error handling, and orchestration inside the bundle and ships runnable demo HTML to prove the flow. Be realistic about Pyodide trade-offs: bundle size and no native-extension support make this best for lightweight, interactive tools and demos rather than heavy backend workloads.
VCR for LLM calls—eliminates API costs and non-determinism in agent testing.