GitHub Repository

Gives coding agents eyes for frontend work — visual QA and verification powered by Yutori Navigator.

7 stars · Python

Frontend-VisualQA — give coding agents eyes to verify their own UI work

Author: dhruvbatra

by dhruvbatra · Apr 7, 2026 · 10 points · 0 comments

AI Analysis

●●● Banger · Big Brain · Solve My Problem · Ship It

Vision models catch UI bugs that Playwright selectors miss — built for AI agent workflows.

Strengths

• Self-correcting navigation automatically recovers when an agent lands on the wrong page
• Catches visual-DOM disagreements, such as a progress bar that doesn't match its label
• MCP server integration works directly with Claude Code and Codex agents

Weaknesses

• Depends on the Yutori n1 API rather than fully open, local inference
• Niche audience: only valuable if you rely heavily on AI coding agents

Post Description

Coding agents today are blind.

They write “valid” HTML/CSS code but can still ship a broken layout, a clipped dropdown, or a page at the wrong URL. Playwright scripts can assert modal.isVisible() without knowing the modal is rendered off-screen.
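
To make the off-screen case concrete, here is a minimal Python sketch of a purely geometric check that is independent of any visibility flag. The `Box` type and the `fraction_in_viewport` helper are hypothetical names for illustration; they are not part of Playwright or frontend-visualqa, and a vision model judges the rendered pixels rather than running a check like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in CSS pixels (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def fraction_in_viewport(element: Box, viewport: Box) -> float:
    """Return the share of the element's area that overlaps the viewport."""
    if element.width <= 0 or element.height <= 0:
        return 0.0
    overlap_w = (min(element.x + element.width, viewport.x + viewport.width)
                 - max(element.x, viewport.x))
    overlap_h = (min(element.y + element.height, viewport.y + viewport.height)
                 - max(element.y, viewport.y))
    overlap_area = max(0.0, overlap_w) * max(0.0, overlap_h)
    return overlap_area / (element.width * element.height)


viewport = Box(x=0, y=0, width=1280, height=720)
# A modal that is laid out and styled as visible, but positioned right of the viewport.
modal = Box(x=1400, y=100, width=400, height=300)
print(fraction_in_viewport(modal, viewport))  # 0.0
```

An element can have a non-empty box and no hiding styles, yet still score 0.0 here, which is the gap the post describes.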

Essentially, coding agents need “eyes” to verify their own UI work.

frontend-visualqa is a CLI and MCP server that gives Claude Code and Codex visual testing, verification, and QA for a website.

You give it a URL and natural-language claims:

frontend-visualqa verify http://localhost:8000/dashboard.html \
  --claims \
  'The API status indicator shows Active' \
  'The monthly quota progress bar is completely filled'

# → first claim passes, second fails (label says 100% but bar is ~65% full)

It catches visual<->DOM disagreements that selectors are blind to.
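
The progress-bar case can be pictured as comparing two independent readings: what the label says and how much of the bar is actually painted. A rough Python sketch follows. The function name, the pixel widths, and the 5-point tolerance are assumptions made for illustration; n1 makes this judgment from the rendered page, not from code like this.

```python
import re


def label_matches_fill(label: str, fill_px: float, track_px: float,
                       tolerance_pct: float = 5.0) -> bool:
    """Compare the percentage stated in a label with the painted share of the bar."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", label)
    if match is None or track_px <= 0:
        raise ValueError("need a percentage label and a positive track width")
    label_pct = float(match.group(1))
    fill_pct = 100.0 * fill_px / track_px
    return abs(label_pct - fill_pct) <= tolerance_pct


# The label claims 100%, but only 195 of 300 pixels (65%) are filled.
print(label_matches_fill("100% used", fill_px=195, track_px=300))  # False
```

A selector-based test typically reads only the label text, so it would see "100%" and pass.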

You can also test interactive flows without hardcoded data:

frontend-visualqa verify 'http://localhost:8000/booking_form.html' \
  --claims 'The date on the confirmation page matches the date selected on the calendar' \
  --navigation-hint "Fill out the form with example data"

# → fails: fills the form, picks a date, books the slot, and catches an off-by-one date error on the confirmation page
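
If you want to drive the CLI from a Python script, one option is to build the argument list programmatically. The sketch below uses only the subcommand and the two flags shown in this post (`verify`, `--claims`, `--navigation-hint`); the `build_verify_command` helper is a hypothetical name, and how the CLI reports results (exit codes, output format) is not described here, so the sketch stops at constructing the command.

```python
import shlex
from typing import Optional, Sequence


def build_verify_command(url: str, claims: Sequence[str],
                         navigation_hint: Optional[str] = None) -> list[str]:
    """Assemble argv for `frontend-visualqa verify` from the flags shown in the post."""
    if not claims:
        raise ValueError("at least one claim is required")
    argv = ["frontend-visualqa", "verify", url, "--claims", *claims]
    if navigation_hint is not None:
        argv += ["--navigation-hint", navigation_hint]
    return argv


argv = build_verify_command(
    "http://localhost:8000/booking_form.html",
    ["The date on the confirmation page matches the date selected on the calendar"],
    navigation_hint="Fill out the form with example data",
)
# Print a copy-pasteable shell command; pass `argv` to subprocess.run to execute it.
print(shlex.join(argv))
```

Passing the arguments as a list keeps claims containing quotes or spaces intact without any manual shell escaping.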

The visual evaluation runs on n1, a VLM by Yutori that is post-trained with RL on live websites specifically for browser interaction. It navigates pages autonomously, so when a coding agent sends it to the wrong URL, n1 sees the wrong page, self-corrects, and reports the correction. On browser-use benchmarks, n1 slightly outperforms Opus 4.6 and GPT-5.4 while running 2-3x faster at 4-5x lower cost: https://yutori.com/blog/introducing-n1

How does this compare to the alternatives?

1. Playwright CLI+MCP
   - Gold standard, but blind.
   - frontend-visualqa is the visual verification layer on top.

2. OpenAI Playwright skill / Claude + Dev-Browser
   - Similar idea, but n1 is specifically trained for browser use (thus faster and cheaper), and the claim-based approach structures what to check rather than hoping the model notices everything.
   - Not locked to a TUI or IDE.

Known limitations:
- Native <select> dropdowns render as OS-level widgets outside the viewport, so n1 can't see or interact with them. Custom dropdowns work fine.
- Small visual/numeric disagreements (e.g., a red vs green status dot) are a known hard case. Improving with model updates.
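
Because native <select> elements are a known blind spot, one option is to flag them before writing claims that depend on them. Below is a minimal sketch using Python's standard `html.parser`. The `NativeSelectFinder` class and `find_native_selects` function are hypothetical helpers, not a feature of frontend-visualqa, and the scan only sees static HTML, so dropdowns created by JavaScript at runtime would be missed.

```python
from html.parser import HTMLParser


class NativeSelectFinder(HTMLParser):
    """Collect the id (or name) of every native <select> element in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.found: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "select" also matches <SELECT>.
        if tag == "select":
            attributes = dict(attrs)
            self.found.append(attributes.get("id") or attributes.get("name") or "<unnamed>")


def find_native_selects(html: str) -> list[str]:
    """Return identifiers for native dropdowns that a viewport screenshot may not show."""
    finder = NativeSelectFinder()
    finder.feed(html)
    finder.close()
    return finder.found


page = '<form><select id="country"><option>US</option></select><div role="listbox"></div></form>'
print(find_native_selects(page))  # ['country']
```

The custom `role="listbox"` dropdown is not flagged, matching the note that custom dropdowns work fine.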

Requires a Yutori API key (new accounts get free credits). DM me if you run out of credits.