GitHub Repository

Evaluate structured LLM outputs with precision. Compare model outputs against expected schemas and values — row by row.

4 stars · TypeScript

EvalLens – Open-source tool to evaluate structured LLM outputs

by simonrendon · Apr 6, 2026 · 1 point · 0 comments

AI Analysis

●● Solid · Niche Gem · Ship It

Schema conformance checks beat generic text evals for JSON-heavy LLM pipelines.

Strengths

• Failure taxonomy explains why structured outputs broke instead of just binary pass/fail.
• Self-hosted mode generates actuals via API keys without sending data externally.
• Exports branded PDF reports for sharing regression testing results with stakeholders.
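
To make the failure-taxonomy idea concrete, here is a minimal sketch of row-by-row checking that reports why a row failed rather than a bare pass/fail. It is not EvalLens's code: the category names (`invalid_json`, `not_an_object`, `missing_field`, `type_mismatch`, `value_mismatch`) and the functions `evaluateRow` and `evaluateRows` are hypothetical, and it assumes flat expected rows with primitive values.

```typescript
// Hypothetical failure categories for illustration; the project's real taxonomy may differ.
type Failure =
  | { kind: "invalid_json" }
  | { kind: "not_an_object" }
  | { kind: "missing_field"; field: string }
  | { kind: "type_mismatch"; field: string; expected: string; actual: string }
  | { kind: "value_mismatch"; field: string; expected: unknown; actual: unknown };

// Assumption: expected rows are flat objects of primitive values.
type Expected = Record<string, string | number | boolean | null>;

// Distinguish null from other objects, since typeof null is "object".
function typeName(value: unknown): string {
  return value === null ? "null" : typeof value;
}

// Compare one raw model output against one expected row.
// An empty array means the row passed.
function evaluateRow(rawOutput: string, expected: Expected): Failure[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawOutput);
  } catch {
    return [{ kind: "invalid_json" }];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return [{ kind: "not_an_object" }];
  }

  const actual = parsed as Record<string, unknown>;
  const failures: Failure[] = [];
  for (const [field, want] of Object.entries(expected)) {
    if (!(field in actual)) {
      failures.push({ kind: "missing_field", field });
      continue;
    }
    const got = actual[field];
    if (typeName(want) !== typeName(got)) {
      failures.push({
        kind: "type_mismatch",
        field,
        expected: typeName(want),
        actual: typeName(got),
      });
    } else if (got !== want) {
      failures.push({ kind: "value_mismatch", field, expected: want, actual: got });
    }
  }
  return failures;
}

// Evaluate a dataset row by row, keeping one failure list per row.
function evaluateRows(rows: { output: string; expected: Expected }[]): Failure[][] {
  return rows.map((row) => evaluateRow(row.output, row.expected));
}

const results = evaluateRows([
  { output: '{"name": "Ada", "age": 36}', expected: { name: "Ada", age: 36 } },   // passes
  { output: '{"name": "Ada", "age": "36"}', expected: { name: "Ada", age: 36 } }, // type_mismatch on age
  { output: '{"name": "Ada"', expected: { name: "Ada", age: 36 } },               // invalid_json
]);
console.log(JSON.stringify(results, null, 2));
```

Keeping failures as a tagged union means a report can group rows by `kind`, which is what separates "the JSON did not parse" from "the JSON parsed but a value was wrong".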

Weaknesses

• Only 4 GitHub stars suggests a very early stage with unproven community traction.
• Hosted version requires uploading sensitive prompt/output data to external servers.

Similar Projects

AI/ML ●●● Banger

A new benchmark for testing LLMs for deterministic outputs

Finally separates JSON validity from actual value hallucination in LLM outputs.

Big Brain · Solve My Problem

khurdula

60301mo ago

AI/ML ●● Solid

Valohai LLM – Track and compare LLM evaluation results in one dashboard

Streams evals from a tiny Python client into a shared dashboard and lets you run parameter sweeps and compare up to six configurations with radar/bar charts and scorecards: exactly the sort of tooling that stops results getting lost in notebooks. A useful, pragmatic product for teams who repeatedly evaluate models, but it competes with general observability and experiment trackers (W&B, Neptune) and will need strong integrations and metric flexibility to stand out.

Niche Gem · Solve My Problem

radicain

304mo ago

Developer Tools ●● Solid