Signals – finding the most informative agent traces without LLM judges

by sparacha · Apr 5, 2026 · 3 points · 0 comments

AI Analysis

●● Solid · Big Brain · Niche Gem

Heuristic signals triage agent traces 1.52x more efficiently than random sampling.

Strengths
  • 82% informativeness rate versus 54% for random sampling in the authors' τ-bench annotation study.
  • Taxonomy covers interaction, execution, and environment patterns, and computing it needs no GPU.
Weaknesses
  • Presented mainly as a research paper; the post points to the Plano project for an implementation and mentions no standalone library.
  • The agent observability space is crowded (LangSmith, Arize Phoenix), and those tools already offer sampling features.
Target Audience

LLM application developers, ML engineers building agentic systems

Similar To

LangSmith · Arize Phoenix · Helicone

Post Description

Hey HN

Salman, Shuguang and Adil here from Katanemo Labs (a DigitalOcean company).

Wanted to introduce our latest research on agentic systems called Signals. If you've been building agents, you've probably noticed that there are far too many agent traces/trajectories to review one by one, and using humans or extra LLM calls to inspect all of them gets expensive really fast. The paper proposes a lightweight way to compute structured “signals” from live agent interactions so you can surface the trajectories most worth looking at, without changing the agent’s online behavior. Computing Signals doesn't require a GPU.
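
As a rough illustration of what "lightweight" can mean here, the sketch below ranks traces using two cheap, rule-based counts: exact repeats of a tool call and failed tool calls. It is not the paper's method or Plano's API. The `Step` and `Trace` schema, the function names, and the unweighted score are all assumptions made up for this example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Step:
    """One tool call in a trace (hypothetical schema, not from the paper)."""
    tool: str
    args: str
    ok: bool = True


@dataclass
class Trace:
    trace_id: str
    steps: list[Step] = field(default_factory=list)


def repeated_call_count(trace: Trace) -> int:
    """Count tool calls that exactly repeat an earlier (tool, args) pair."""
    counts = Counter((s.tool, s.args) for s in trace.steps)
    return sum(n - 1 for n in counts.values() if n > 1)


def failed_call_count(trace: Trace) -> int:
    """Count tool calls that reported an error."""
    return sum(1 for s in trace.steps if not s.ok)


def score(trace: Trace) -> int:
    # Unweighted sum of the two counts; the weighting is an arbitrary
    # choice for this sketch, not something taken from the paper.
    return repeated_call_count(trace) + failed_call_count(trace)


def top_k(traces: list[Trace], k: int) -> list[Trace]:
    """Return the k highest-scoring traces, i.e. the ones to review first."""
    return sorted(traces, key=score, reverse=True)[:k]


traces = [
    Trace("a", [Step("search", "q=refund"), Step("reply", "done")]),
    Trace("b", [
        Step("search", "q=refund"),
        Step("search", "q=refund"),
        Step("search", "q=refund", ok=False),
    ]),
]
print([t.trace_id for t in top_k(traces, 1)])
```

Everything here is plain counting over already-logged steps, so it runs on a CPU and never touches the agent's online behavior. That is the property the post emphasizes, even though the real signals are presumably richer than these two counts.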

Signals are grouped into a simple taxonomy across interaction, execution, and environment patterns, including things like misalignment, stagnation, disengagement, failure, looping, and exhaustion. In an annotation study on τ-bench, signal-based sampling reached an 82% informativeness rate versus 54% for random sampling, which translated to a 1.52x efficiency gain per informative trajectory.
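
The 1.52x figure follows directly from the two rates quoted above. A quick check of the arithmetic is sketched below. The variable names are illustrative, and the "traces reviewed per informative one" reading assumes each sampled trace is informative at the stated rate.

```python
# Rates quoted in the post (τ-bench annotation study).
signal_rate = 0.82  # informative fraction under signal-based sampling
random_rate = 0.54  # informative fraction under random sampling

# Efficiency gain per informative trajectory: ratio of the two rates.
gain = signal_rate / random_rate

# Expected number of traces reviewed to find one informative trace (1 / rate).
reviews_signal = 1 / signal_rate
reviews_random = 1 / random_rate

print(f"gain: {gain:.2f}x")
print(f"reviews per informative trace: {reviews_signal:.2f} vs {reviews_random:.2f}")
```

Under that assumption, a reviewer looks at about 1.22 traces per informative one with signal-based sampling, versus about 1.85 with random sampling.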

Paper: arXiv 2604.00356. Project where Signals are already implemented: https://github.com/katanemo/plano

Happy to answer questions on the taxonomy, implementation details, or where this breaks down.
