OculOS – Any desktop app as a JSON API via OS accessibility tree
Accessibility tree → REST+MCP means Claude controls Spotify with zero instrumentation.
If it's on the screen, it's an API. Control any desktop app via REST + MCP. Rust.
Accessibility tree API beats OCR—Claude controls Spotify without screenshots.
AI agent builders, automation engineers, Claude/Cursor power users
UIAutomator (Android) · Accessibility Inspector (Apple) · PyAutoGUI with accessibility APIs
I built OculOS because giving AI agents (like Claude Code or Cursor) control over desktop apps is still surprisingly difficult. Most current solutions rely on slow OCR/Vision or fragile pixel coordinates.
OculOS is a lightweight daemon written in Rust that reads the OS accessibility tree and exposes every button, text field, and menu item as a structured JSON API and MCP server.
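As a rough illustration of what "exposing the accessibility tree as structured JSON" could look like, the sketch below flattens a nested tree into addressable element records. The tree shape, the field names (`role`, `name`, `children`), and the path-style ids are all assumptions made for illustration; this post does not show the actual OculOS schema.

```python
import json

# Hypothetical accessibility tree, in the nested shape an OS API might report.
# Field names are illustrative assumptions, not the real OculOS format.
SAMPLE_TREE = {
    "role": "window",
    "name": "Spotify",
    "children": [
        {"role": "button", "name": "Play", "children": []},
        {"role": "edit", "name": "Search", "children": []},
        {
            "role": "menu",
            "name": "File",
            "children": [{"role": "menuitem", "name": "Quit", "children": []}],
        },
    ],
}


def flatten(node, path="0"):
    """Yield one flat record per element, with a path-style id (illustrative)."""
    yield {"id": path, "role": node["role"], "name": node["name"]}
    for index, child in enumerate(node.get("children", [])):
        yield from flatten(child, f"{path}.{index}")


elements = list(flatten(SAMPLE_TREE))
print(json.dumps(elements, indent=2))
```

A flat list with stable ids is one plausible way to let an agent refer back to an element in a later request without re-walking the tree.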
Why this is different:
Semantic Control: No screenshots or coordinates. The agent interacts with actual UI elements (e.g., "Click the 'Play' button").
Rust-powered: Single binary, zero dependencies, and extremely low latency.
Universal: Supports Windows (UIA), macOS (AXUIElement), and Linux (AT-SPI2).
Local & Private: Everything runs on your machine; no UI data is sent to the cloud.
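To make the semantic-control point concrete, here is a minimal sketch of resolving "Click the 'Play' button" by role and label instead of by pixel coordinates. The element records, the `find_element` and `click_request` helpers, and the payload fields are hypothetical and only show the general idea, not the OculOS API.

```python
# Flat element records in a hypothetical shape (not the real OculOS schema).
ELEMENTS = [
    {"id": "0.0", "role": "button", "name": "Play"},
    {"id": "0.1", "role": "button", "name": "Next"},
    {"id": "0.2", "role": "edit", "name": "Search"},
]


def find_element(elements, role, name):
    """Return the single element matching role and name (case-insensitive)."""
    matches = [
        e for e in elements
        if e["role"] == role and e["name"].lower() == name.lower()
    ]
    if len(matches) != 1:
        # Refuse to guess when the selector is missing or ambiguous.
        raise LookupError(
            f"expected 1 match for {role} '{name}', found {len(matches)}"
        )
    return matches[0]


def click_request(element):
    """Build a hypothetical action payload that targets an element id."""
    return {"action": "click", "element_id": element["id"]}


request = click_request(find_element(ELEMENTS, "button", "play"))
print(request)
```

Because the lookup keys on role and label, the same request keeps working if the window is moved or resized, which is where coordinate-based automation tends to break.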
It also includes a built-in dashboard for element inspection and an automation recorder. I’m looking forward to your feedback and technical questions!