An experiment in giving coding agents long-term memory
Persistent memory for coding agents when Cursor and Devin already dominate this space.

Demand-paging memory for agents beats context window limits that break Cursor and Devin.
ML engineers and deep tech teams running iterative experiments
Cursor · Devin · OpenDevin
A real engineering experiment can run for hours. Along the way, the agent reads files, runs commands, checks logs, compares metrics, tries ideas that fail, and needs to remember what already happened. Once context starts slipping, it forgets the goal, loses track of the baseline, and retries bad ideas.
Remoroo is my attempt to solve that problem.
You point it at a repo and give it a measurable goal. It runs locally, tries changes, executes experiments, measures the result, keeps what helps, and throws away what does not.
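To make that loop concrete, here is a minimal sketch of a keep-what-helps cycle. It is not Remoroo's actual code: the function name `run_experiments`, its parameters, and the "higher score is better" rule are all assumptions made for illustration.

```python
from typing import Any, Callable, Iterable, List, Tuple


def run_experiments(
    baseline: Any,
    candidates: Iterable[Any],
    apply_change: Callable[[Any, Any], Any],
    measure: Callable[[Any], float],
) -> Tuple[Any, float, List[Tuple[Any, float, bool]]]:
    """Try each candidate change and keep it only if the metric improves.

    Hypothetical helper: assumes a single scalar metric where higher is better.
    """
    best_state = baseline
    best_score = measure(baseline)  # the baseline is measured once, up front
    history = []  # (change, score, kept) so failed ideas are not retried blindly

    for change in candidates:
        trial = apply_change(best_state, change)
        score = measure(trial)
        kept = score > best_score
        history.append((change, score, kept))
        if kept:
            # Keep what helps: the trial becomes the new baseline.
            best_state, best_score = trial, score
        # Otherwise the trial is simply dropped and best_state is unchanged.

    return best_state, best_score, history


if __name__ == "__main__":
    # Toy metric: the closer the learning rate is to 0.1, the better.
    def toy_measure(cfg):
        return -((cfg["lr"] - 0.1) ** 2)

    best, score, history = run_experiments(
        {"lr": 1.0},
        [{"lr": 0.5}, {"lr": 0.1}, {"lr": 0.3}],
        lambda state, change: {**state, **change},
        toy_measure,
    )
    print(best, [kept for _, _, kept in history])
```

The history list is the part that matters for long runs: it is the record of what was already tried, which is exactly what gets lost once context slips.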
A big part of the system is memory. Long runs generate far more context than a model can hold, so I built a demand-paging memory system inspired by OS virtual memory to keep the run coherent over time.
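For readers unfamiliar with the OS analogy, here is a toy sketch of demand paging applied to agent context. It is an illustration under my own assumptions, not the real implementation: the `PagedMemory` class, its method names, and the least-recently-used eviction policy are all hypothetical.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy demand-paged store: a small resident set backed by a larger store.

    Hypothetical sketch. "Resident" pages stand in for what fits in the model's
    context window; "backing" pages stand in for everything kept outside it.
    """

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.resident = OrderedDict()  # page_id -> text, oldest use first
        self.backing = {}              # evicted pages, still recoverable
        self.faults = 0                # number of pages loaded back on demand

    def write(self, page_id: str, text: str) -> None:
        """Store or update a page and mark it as most recently used."""
        self.backing.pop(page_id, None)  # avoid a stale copy in the backing store
        self.resident[page_id] = text
        self.resident.move_to_end(page_id)
        self._evict()

    def read(self, page_id: str) -> str:
        """Return a page, paging it in from the backing store if needed."""
        if page_id in self.resident:
            self.resident.move_to_end(page_id)
            return self.resident[page_id]
        text = self.backing.pop(page_id)  # raises KeyError for unknown pages
        self.faults += 1                  # page fault: the page was not resident
        self.resident[page_id] = text
        self._evict()
        return text

    def _evict(self) -> None:
        """Move least recently used pages out until the resident set fits."""
        while len(self.resident) > self.capacity:
            old_id, old_text = self.resident.popitem(last=False)
            self.backing[old_id] = old_text

    def context(self) -> list:
        """Pages that would currently be placed in the prompt."""
        return list(self.resident.values())


if __name__ == "__main__":
    mem = PagedMemory(capacity=2)
    mem.write("goal", "maximize validation accuracy")
    mem.write("baseline", "accuracy 0.81 on commit abc")
    mem.write("log1", "lr=0.5 diverged")  # pushes "goal" out of the resident set
    print(mem.read("goal"), mem.faults)   # "goal" is paged back in on demand
```

The point of the analogy is that nothing is ever thrown away: the goal and the baseline can leave the prompt when space is tight, but they come back as soon as the agent asks for them.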
There is a technical writeup here: https://www.remoroo.com/blog/how-remoroo-works
Would love feedback from people working on long-running agents, training loops, eval harnesses, or similar workflows.
Trigger-based cognitive architecture for Claude Code loses context anyway without API-level state persistence.
Red vs Blue AI agents battling over your code beats static scanning.
Actually runs 30 agents in parallel—solved real orchestration pain, not just agent wrapping.
Local multi-agent orchestration that claims autonomy—but execution clarity and real-world viability remain unproven.
Task queue for AI agents, but orchestrates existing tools without novel architecture.