Change multiple parts of an image at once with annotations tool [video]
YouTube tutorial without a product link — can't actually try the tool.
An understudy watches. Then performs.
Intent extraction beats brittle coordinate macros, but desktop agents are getting crowded fast.
Knowledge workers automating repetitive desktop workflows
Cursor · Adept AI · UiPath
Understudy is a local-first desktop agent runtime that can operate GUI apps, browsers, shell tools, files, and messaging in one session. The part I'm most interested in feedback on is teach-by-demonstration: you do a task once, the agent records screen video + semantic events, extracts the intent rather than coordinates, and turns it into a reusable skill.
Demo video: https://www.youtube.com/watch?v=3d5cRGnlb_0
In the demo I teach it: Google Image search -> download a photo -> remove background in Pixelmator Pro -> export -> send via Telegram. Then I ask it to do the same for Elon Musk. The replay isn't a brittle macro: the published skill stores intent steps and route options, and keeps GUI hints only as a fallback. In this example it can also prefer faster routes when they are available instead of repeating every GUI step.
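To make the idea of intent steps with route options concrete, here is a minimal sketch in Python. It is not Understudy's actual skill format: the names `Route`, `IntentStep`, `pick_route`, and `plan`, the `cost` field, and the route descriptions are all hypothetical, chosen only to illustrate how a replay could prefer a faster route and fall back to the GUI when nothing else is available.

```python
from dataclasses import dataclass, field

# Hypothetical schema for illustration only; not Understudy's real skill format.

@dataclass
class Route:
    kind: str    # assumed route categories, e.g. "shell" or "gui"
    cost: int    # assumed ranking: lower means faster / preferred
    action: str  # human-readable description of what the route does

@dataclass
class IntentStep:
    intent: str                                   # what the step achieves, not where to click
    routes: list[Route] = field(default_factory=list)

def pick_route(step: IntentStep, available: set[str]) -> Route:
    """Return the cheapest route whose kind is currently available."""
    usable = [r for r in step.routes if r.kind in available]
    if not usable:
        raise LookupError(f"no usable route for step: {step.intent}")
    return min(usable, key=lambda r: r.cost)

def plan(skill: list[IntentStep], subject: str, available: set[str]) -> list[tuple[str, str]]:
    """Fill in the subject and choose one route kind per intent step."""
    return [
        (step.intent.format(subject=subject), pick_route(step, available).kind)
        for step in skill
    ]

# A two-step fragment of the demo workflow, with the subject as a parameter.
skill = [
    IntentStep("download a photo of {subject}", [
        Route("shell", 1, "fetch the image URL with a download command"),
        Route("gui", 3, "save the image from the browser window"),
    ]),
    IntentStep("remove the background", [
        Route("gui", 3, "use the background-removal tool in the image editor"),
    ]),
]

if __name__ == "__main__":
    for intent, kind in plan(skill, "Elon Musk", {"shell", "gui"}):
        print(f"{kind:>5}: {intent}")
```

Because each step records what it is meant to achieve rather than screen coordinates, the same skill can be replayed with a different subject, and a step with only a GUI route still works when no faster route exists.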
Current state: macOS only. Layers 1-2 are working today; Layers 3-4 are partial and still early.
npm install -g @understudy-ai/understudy
understudy wizard
GitHub: https://github.com/understudy-ai/understudy
Happy to answer questions about the architecture, teach-by-demonstration, or the limits of the current implementation.
Just a YouTube demo of a whiteboard feature with no code or product to try.
Logic gates as stateful bytecode tasks—elegant model, but narrow use case.
Browser automation agent when BrowserUse and MultiOn already exist.
Visualizes the exact four-step path where AI code assistance becomes action authority.
This repo actually wires an OpenCode agent to Membrane so the agent can find existing connectors and synthesize missing ones on the fly — intent becomes action, not just a toy prompt example. It ships a runnable Next.js UI and clear quick-start steps, which makes the idea tangible fast; what I'd like to see next are security notes, more examples of complex connector synthesis, and tests that prove the approach scales beyond demos.