I challenged an LLM to find a hidden problem in my telemetry data [video]

by rabidpraxis · Feb 25, 2026 · 2 points · 0 comments

AI Analysis

●● Solid · Big Brain · Solve My Problem

Claude finds a hidden performance bug in telemetry via natural-language queries, but the tool itself is an MCP wrapper.

Strengths

• Context injection (session_id + intent events) makes end-to-end correlation automatic, reducing instrumentation boilerplate.
• Real incident demo with a specific repro (Braintree + MX card + EU region) shows concrete value over the generic 'AI investigates data' narrative.
• MCP integration means Claude can query, segment, and build dashboards without leaving the chat: a genuine workflow improvement.

Weaknesses

• Core insight is 'instrumentation + natural language querying'; Datadog, Sumo Logic, and New Relic already offer AI incident analysis at scale.
• Demo is a contrived hidden bug in a tiny Rails app (instrumented in 5 lines); unclear whether this generalizes to messy real-world telemetry or competes with established APM platforms.

Post Description

I hid a performance bug in a Rails cart’s telemetry (no errors, global latency looked fine) and challenged an LLM to find it just by querying the data. It did, then built a dashboard + alert through an MCP server we built at Honeybadger.

I instrumented a tiny Rails shopping cart (5 lines). Insights auto-captures request/controller + ActiveRecord events, and I added two bits of business context: a session_id injected into every event so checkout activity correlates end-to-end, plus a single intent event that records region, cart_total, payment_gateway, and card_type.
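To make the shape of that instrumentation concrete, here is a minimal sketch in Python rather than the Rails code from the video. It is not the Honeybadger API: `EventEmitter`, `set_context`, `emit`, and the event names are hypothetical stand-ins, and only the four intent fields and `session_id` come from the post.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class EventEmitter:
    """Hypothetical in-memory emitter; stands in for a real telemetry client."""

    context: dict[str, Any] = field(default_factory=dict)
    sent: list[dict[str, Any]] = field(default_factory=list)

    def set_context(self, **fields: Any) -> None:
        # Context persists and is merged into every later event.
        self.context.update(fields)

    def emit(self, event_type: str, **payload: Any) -> dict[str, Any]:
        # Context goes first so explicit payload keys win on conflict.
        event = {**self.context, "event_type": event_type, **payload}
        self.sent.append(event)
        return event


emitter = EventEmitter()
emitter.set_context(session_id="sess-123")

# An auto-captured request event (illustrative name) now carries session_id.
emitter.emit("request", method="GET", path="/checkout", duration_ms=120)

# The single business "intent" event with the four fields named in the post.
emitter.emit(
    "checkout_intent",
    region="EU",
    cart_total=42.50,
    payment_gateway="braintree",
    card_type="MX",
)
```

Because every event shares `session_id`, the request timings and the intent fields can be joined later without any extra bookkeeping in the app.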

The hidden issue: checkout is slow only for Braintree when card_type=MX and region=EU. No errors, uptime green, overall latency looks fine.
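A small sketch of why a bug like this hides from global and single-dimension views. The rows are synthetic and the `slow_segments` helper and its 3x threshold are assumptions for illustration, not the query the model actually ran.

```python
from collections import defaultdict
from statistics import median

# Synthetic checkout timings: every segment is ~200 ms except one combination.
rows = []
for gateway in ("braintree", "stripe"):
    for card in ("VISA", "MX"):
        for region in ("EU", "US"):
            slow = (gateway, card, region) == ("braintree", "MX", "EU")
            for i in range(5):
                rows.append({
                    "payment_gateway": gateway,
                    "card_type": card,
                    "region": region,
                    "duration_ms": (2000 if slow else 200) + i,
                })


def slow_segments(rows, keys, factor=3.0):
    """Return segments whose median duration exceeds factor x the overall median."""
    overall = median(r["duration_ms"] for r in rows)
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r["duration_ms"])
    return {
        seg: median(vals)
        for seg, vals in groups.items()
        if median(vals) > factor * overall
    }


# Segmenting by region alone finds nothing; all three dimensions isolate the outlier.
by_region = slow_segments(rows, ("region",))
by_all = slow_segments(rows, ("payment_gateway", "card_type", "region"))
```

In this toy data the slow combination is one eighth of the traffic, so the overall median and the EU-only median both stay near 200 ms. Only the three-way split surfaces it, which is the kind of iterative narrowing described in the post.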

In the 6-min video I give the model a vague prompt (“EU customers report slow checkout”). It segments until it finds the outlier, infers abandonment as GET without POST per session_id, estimates impact (~$69 in the current slice, ~$1.2k over a week, rough), then creates a dashboard + alert via the MCP server.
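The abandonment inference can be sketched as a set difference per `session_id`. This is a rough illustration on synthetic events: the `/checkout` path, the helper names, and the linear weekly extrapolation are assumptions, since the post does not say how the weekly figure was derived.

```python
def abandoned_sessions(events, path="/checkout"):
    """Sessions that loaded the checkout page (GET) but never submitted it (POST)."""
    saw_get, saw_post = set(), set()
    for e in events:
        if e.get("path") != path:
            continue
        if e["method"] == "GET":
            saw_get.add(e["session_id"])
        elif e["method"] == "POST":
            saw_post.add(e["session_id"])
    return saw_get - saw_post


def estimated_loss(events, abandoned):
    """Sum cart_total (taken from intent events) over abandoned sessions."""
    totals = {e["session_id"]: e["cart_total"] for e in events if "cart_total" in e}
    return sum(totals.get(s, 0.0) for s in abandoned)


def extrapolate(loss, slice_hours, target_hours=24 * 7):
    """Naive linear scaling from the observed slice to a longer window."""
    return loss * target_hours / slice_hours


# Synthetic events: session "a" completes checkout; "b" and "c" abandon it.
events = [
    {"session_id": "a", "method": "GET", "path": "/checkout"},
    {"session_id": "a", "method": "POST", "path": "/checkout"},
    {"session_id": "a", "cart_total": 30.0},
    {"session_id": "b", "method": "GET", "path": "/checkout"},
    {"session_id": "b", "cart_total": 25.0},
    {"session_id": "c", "method": "GET", "path": "/checkout"},
    {"session_id": "c", "cart_total": 15.0},
]

abandoned = abandoned_sessions(events)
loss = estimated_loss(events, abandoned)
weekly = extrapolate(loss, slice_hours=8)
```

Linear scaling ignores daily and weekly traffic patterns, which fits the post's own description of the weekly number as rough.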

Happy to discuss the MCP architecture, the query language, and what surprised me vs. fell flat.