GitHub Repository

Experimental uncertainty quantification plugin for agent frameworks.

3 stars · Python

AgentUQ, a token-logprob runtime gate for LLM agents

by AntoineN2·Mar 10, 2026·1 point·0 comments

AI Analysis

●●● Banger · Big Brain · Ship It · Solve My Problem

Skips heavy judge loops by using logprobs to gate agent actions at runtime.

Strengths
  • Localizes risk to exact spans like SQL clauses instead of scoring whole responses.
  • Single-pass analysis avoids expensive extra LLM calls for verification or judging.
  • Routes specific actions like retry or block based on localized uncertainty signals.
Weaknesses
  • Stable integration limited to OpenAI Responses API; others remain in preview status.
  • Relies on provider logprob availability, which varies across model vendors.
Category
Target Audience

Backend developers building LLM agent workflows

Similar To

LangSmith · Arize Phoenix · Guardrails AI

Post Description

I built this over the weekend because agent systems seem to jump from static guardrails to much heavier judge-style loops, with not much in between.

AgentUQ uses provider logprobs to localize brittle action-bearing spans in an agent step and route actions like continue, retry, verify, ask for confirmation, or block.
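To make the idea concrete, here is a minimal sketch of span-level gating on token logprobs. It is not AgentUQ's actual API: the `Token` type, the `span_surprisal` and `route` helpers, the thresholds, and the sample logprob values are all hypothetical, chosen only to illustrate scoring one span (here a SQL `WHERE` clause) and mapping the score to an action.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    RETRY = "retry"
    VERIFY = "verify"
    ASK_CONFIRMATION = "ask_confirmation"
    BLOCK = "block"


@dataclass
class Token:
    text: str
    logprob: float  # natural-log probability, as reported by the provider


def span_surprisal(tokens, start, end):
    """Return (mean, max) surprisal in nats over tokens[start:end]."""
    values = [-t.logprob for t in tokens[start:end]]
    if not values:
        raise ValueError("empty span")
    return sum(values) / len(values), max(values)


def route(max_surprisal, side_effects):
    """Map a span's worst-case surprisal to an action.

    Thresholds are illustrative assumptions, not tuned values.
    """
    if max_surprisal < 0.7:
        return Action.CONTINUE
    if max_surprisal < 2.0:
        return Action.ASK_CONFIRMATION if side_effects else Action.RETRY
    return Action.BLOCK if side_effects else Action.VERIFY


# Made-up logprobs for: SELECT name FROM users WHERE id = 7
tokens = [
    Token("SELECT", -0.01), Token(" name", -0.05),
    Token(" FROM", -0.01), Token(" users", -0.10),
    Token(" WHERE", -0.20), Token(" id", -1.20),
    Token(" =", -0.05), Token(" 7", -2.30),
]

# Score the two clauses separately instead of the whole response.
_, select_max = span_surprisal(tokens, 0, 4)
_, where_max = span_surprisal(tokens, 4, 8)

select_action = route(select_max, side_effects=False)  # Action.CONTINUE
where_action = route(where_max, side_effects=False)    # Action.VERIFY
```

Scoring the `WHERE` clause on its own means a single low-probability literal triggers a check on that clause, while the confident `SELECT ... FROM` prefix passes through untouched. Using the maximum rather than the mean is one possible design choice: it keeps a single brittle token from being averaged away by confident neighbors.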

The claim is intentionally narrow. It isn’t trying to determine truth. I know uncertainty work can feel theoretical, so I wanted to test whether a smaller operational use of it could be useful in practice.

My bigger belief is that agents need infrastructure that learns from production history (failed runs as well as low-confidence ones) instead of just accumulating patches. This is one concrete experiment in that direction.

Similar Projects

AI/ML · ●● Solid

LocalAgent v0.5.0, a local-first Rust agent runtime

Approval gates and replayable artifacts solve real local agent debugging pain points.

Big Brain · Niche Gem
CalvinBuild
20 · 3mo ago