GitHub Repository

Gives coding agents eyes for frontend work — visual QA and verification powered by Yutori Navigator.

7 stars · Python

Frontend-VisualQA — give coding agents eyes to verify their own UI work

by dhruvbatra · Apr 7, 2026 · 10 points · 0 comments

AI Analysis

●●● Banger · Big Brain · Solve My Problem · Ship It

Vision models catch UI bugs that Playwright selectors miss — built for AI agent workflows.

Strengths
  • Self-correcting navigation automatically recovers when agents land on the wrong page
  • Catches visual-DOM disagreements like progress bars that don't match their labels
  • MCP server integration works directly with Claude Code and Codex agents
Weaknesses
  • Depends on Yutori n1 API rather than fully open local inference
  • Niche audience — only valuable if you're heavily using AI coding agents
Target Audience

Developers using AI coding agents, frontend engineers, QA teams

Similar To

Playwright · Percy · Applitools

Post Description

Coding agents today are blind.

They write “valid” HTML/CSS code but can still ship a broken layout, a clipped dropdown, or a page at the wrong URL. Playwright scripts can assert modal.isVisible() without knowing the modal is rendered off-screen.
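To make the off-screen modal case concrete, here is a minimal sketch of a geometric check that a plain visibility assertion does not perform. It is not part of frontend-visualqa. The function name `visible_fraction` and the sample boxes are hypothetical, and the sketch assumes the box is given in viewport coordinates, the kind of `x`/`y`/`width`/`height` dict that Playwright's `bounding_box()` returns.

```python
def visible_fraction(box, viewport):
    """Return the fraction of an element's area that lies inside the viewport.

    box:      dict with x, y, width, height (pixels, relative to the viewport origin)
    viewport: dict with width, height (pixels)
    """
    area = box["width"] * box["height"]
    if area <= 0:
        return 0.0
    # Clamp the element's rectangle to the viewport rectangle.
    left = max(box["x"], 0)
    top = max(box["y"], 0)
    right = min(box["x"] + box["width"], viewport["width"])
    bottom = min(box["y"] + box["height"], viewport["height"])
    # A negative extent means there is no overlap on that axis.
    overlap = max(right - left, 0) * max(bottom - top, 0)
    return overlap / area


# Hypothetical layouts for a 400x400 modal in a 1280x720 viewport.
viewport = {"width": 1280, "height": 720}
modal_on_screen = {"x": 440, "y": 160, "width": 400, "height": 400}
modal_off_screen = {"x": 1400, "y": 160, "width": 400, "height": 400}
modal_clipped = {"x": 1080, "y": 160, "width": 400, "height": 400}

for name, box in [("on screen", modal_on_screen),
                  ("off screen", modal_off_screen),
                  ("clipped", modal_clipped)]:
    print(name, visible_fraction(box, viewport))
```

All three modals could be attached to the DOM and styled as visible, yet only the first is fully in view. A check like this covers one failure mode. It says nothing about overlapping elements, wrong colors, or broken layout, which is where a visual check comes in.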

Essentially, coding agents need “eyes” to verify their own UI work.

frontend-visualqa is a CLI and MCP server that gives Claude Code and Codex visual testing, verification, and QA for websites.

You give it a URL and natural-language claims:

frontend-visualqa verify http://localhost:8000/dashboard.html \
  --claims \
  'The API status indicator shows Active' \
  'The monthly quota progress bar is completely filled'

# → first claim passes, second fails (label says 100% but bar is ~65% full)

It catches visual<->DOM disagreements that selectors are blind to.
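As a toy illustration of what a visual<->DOM disagreement looks like, the sketch below compares the percentage in a label with the fraction of a progress bar that is actually painted in the fill color. This is not how n1 works (n1 is a vision-language model looking at rendered pages). The pixel row, colors, function names, and 5% tolerance are all made-up assumptions for the example.

```python
import re


def filled_fraction(pixel_row, fill_color):
    """Fraction of pixels in one horizontal row of the bar that match the fill color."""
    if not pixel_row:
        return 0.0
    return sum(1 for pixel in pixel_row if pixel == fill_color) / len(pixel_row)


def label_fraction(label):
    """Parse a label such as '100%' or '65% used' into a fraction, or None if absent."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", label)
    return float(match.group(1)) / 100 if match else None


def bar_matches_label(pixel_row, fill_color, label, tolerance=0.05):
    """True when the painted fill agrees with the label within the tolerance."""
    claimed = label_fraction(label)
    if claimed is None:
        return False
    return abs(filled_fraction(pixel_row, fill_color) - claimed) <= tolerance


# Hypothetical 200-pixel-wide bar: 130 filled pixels, 70 track pixels (65% full).
GREEN = (34, 197, 94)
GREY = (229, 231, 235)
row = [GREEN] * 130 + [GREY] * 70

print(bar_matches_label(row, GREEN, "100%"))      # label and bar disagree
print(bar_matches_label(row, GREEN, "65% used"))  # label and bar agree
```

A selector-based test that only reads the label text would see "100%" and pass. The disagreement only shows up once the rendered pixels are taken into account.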

You can also test interactive flows without hardcoded data:

frontend-visualqa verify 'http://localhost:8000/booking_form.html' \
  --claims 'The date on the confirmation page matches the date selected on the calendar' \
  --navigation-hint "Fill out the form with example data"

# → fails: fills the form, picks a date, books the slot, and catches an off-by-one date error on the confirmation page
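If you want to drive the CLI from a script, the two invocations above can be assembled programmatically. The sketch below only builds the argument list, using the `verify` subcommand and the `--claims` and `--navigation-hint` flags exactly as they appear in this post. The helper name `build_verify_command` is hypothetical, and the post does not document the CLI's exit codes or output format, so the sketch does not assume either.

```python
import shlex


def build_verify_command(url, claims, navigation_hint=None):
    """Build an argv list for `frontend-visualqa verify` from a URL and claims.

    Only the flags shown in the post are used; nothing else about the CLI is assumed.
    """
    if not claims:
        raise ValueError("at least one claim is required")
    cmd = ["frontend-visualqa", "verify", url, "--claims", *claims]
    if navigation_hint is not None:
        cmd += ["--navigation-hint", navigation_hint]
    return cmd


dashboard_cmd = build_verify_command(
    "http://localhost:8000/dashboard.html",
    [
        "The API status indicator shows Active",
        "The monthly quota progress bar is completely filled",
    ],
)

booking_cmd = build_verify_command(
    "http://localhost:8000/booking_form.html",
    ["The date on the confirmation page matches the date selected on the calendar"],
    navigation_hint="Fill out the form with example data",
)

# Print shell-quoted versions that can be pasted into a terminal.
print(shlex.join(dashboard_cmd))
print(shlex.join(booking_cmd))
```

Passing the list to `subprocess.run(cmd)` would execute it, provided the CLI is installed and a Yutori API key is configured. Using a list rather than a single string avoids shell-quoting problems with claims that contain apostrophes or quotes.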

The visual evaluation runs on n1, a VLM by Yutori that is post-trained specifically for browser interaction with RL on live websites. It navigates pages autonomously, so when a coding agent sends it to the wrong URL, n1 sees the wrong page, self-corrects, and reports the correction. On browser-use benchmarks n1 slightly outperforms Opus 4.6 and GPT-5.4 while running 2-3x faster at 4-5x lower cost: https://yutori.com/blog/introducing-n1

How does this compare to the alternatives?

1. Playwright CLI+MCP
   - Gold standard, but blind.
   - frontend-visualqa is the visual verification layer on top.

2. OpenAI Playwright skill / Claude + Dev-Browser
   - Similar idea, but n1 is specifically trained for browser use (thus faster and cheaper), and the claim-based approach structures what to check rather than hoping the model notices everything.
   - Not locked to a TUI or IDE.

Known limitations:
   - Native <select> dropdowns render as OS-level widgets outside the viewport, so n1 can't see or interact with them. Custom dropdowns work fine.
   - Small visual/numeric disagreements (e.g., a red vs. green status dot) are a known hard case. This is improving with model updates.

Requires a Yutori API key (new accounts get free credits). DM me if you run out of credits.
