Agent swarm to play ARC AGI games within Claude Code and Codex

Author: surferbayarea

by surferbayarea · Mar 20, 2026 · 4 points · 1 comment


AI Analysis

●●● Banger · Rabbit Hole · Crowd Pleaser · Zero to One

Live agent swarm leaderboard for ARC-AGI with no-code prompt strategies.

Strengths

• Auto-improvement mechanism, inspired by Karpathy's autoresearch, lets each agent reflect on its performance and refine its strategy.
• 269 experiments running live, with real-time action tracking and level progress.
• Plain-English prompts mean no coding is required to enter the competition.

Weaknesses

• Depends on Claude Code/Codex availability and pricing for actual agent execution.
• The ARC-AGI competition already has its own established leaderboards and evaluation frameworks.

Post Description

I built an agent swarm platform where anyone can launch an AI agent to play and compete on ARC Prize ARC-AGI-3 games using plain-English strategy prompts, without writing a single line of code.

Just copy-paste a setup prompt (link below) into Claude Code/Codex, add your strategy prompt, and watch a livestream of your agent playing based on your approach and competing with other agents!

I’ve included an auto-improvement mechanism, inspired by Karpathy’s autoresearch, in which your agent reflects on its performance and improves its strategy. You can disable or tweak this mechanism at any time by chatting with your agent in Claude Code/Codex.
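As a rough illustration of what such a reflect-and-revise loop could look like, here is a minimal Python sketch. It is not the platform's actual mechanism: the names `improve_strategy`, `play`, `reflect`, and the toy stand-ins are all hypothetical, and the "keep the candidate only if it scores higher" rule is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ImprovementLog:
    """History of (strategy, score) pairs tried so far."""
    attempts: List[Tuple[str, float]] = field(default_factory=list)


def improve_strategy(
    strategy: str,
    play: Callable[[str], float],          # hypothetical: runs a game, returns a score
    reflect: Callable[[str, float], str],  # hypothetical: proposes a revised strategy
    rounds: int = 3,
    enabled: bool = True,                  # mirrors "you can disable this"
) -> Tuple[str, float, ImprovementLog]:
    """Play, reflect, and keep a revised strategy only when it scores higher."""
    log = ImprovementLog()
    best, best_score = strategy, play(strategy)
    log.attempts.append((best, best_score))
    if not enabled:
        return best, best_score, log
    for _ in range(rounds):
        candidate = reflect(best, best_score)
        score = play(candidate)
        log.attempts.append((candidate, score))
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score, log


# Toy stand-ins so the sketch runs without any agent or game backend.
def toy_play(strategy: str) -> float:
    # Pretend the score is the number of times the prompt says "explore".
    return float(strategy.count("explore"))


def toy_reflect(strategy: str, score: float) -> str:
    # Pretend reflection always suggests exploring more.
    return strategy + " explore"


if __name__ == "__main__":
    best, score, log = improve_strategy("start", toy_play, toy_reflect, rounds=2)
    print(best, score, len(log.attempts))
```

In a real setup, `play` would be replaced by an actual game run and `reflect` by a call to the agent itself. The `enabled` flag stands in for the ability to switch the mechanism off.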

Join the swarm, track your agent on the leaderboard, and compete to find the best approach!

Similar Projects

AI/ML · ●● Solid

Lessons learned from running Claude Code swarms at scale

Running 15 concurrent agents without burning through API limits faster than CrewAI or AutoGen.

Big Brain · Ship It

sermakarevich

10716d ago

Gaming · ●● Solid

I turned ARC-AGI-3 into a daily browser game

Wordle-style daily format makes ARC-AGI puzzles actually fun to play.

Rabbit Hole · Crowd Pleaser

preyneyv

40 · 2mo ago

AI/ML · ●● Solid

Solving ARC AGI 2 with interleaved thinking and stateful IPython REPL

They show a surprisingly large effect: putting models into an interleaved-thinking regime with a stateful IPython REPL yields massive score boosts (>4x on GPT-OSS-120B, double-digit gains up to frontier models). The repo isn't just a paper — it includes pragmatic engineering (a patched vLLM image, ipybox/daytona integration, solver configs) so you can reproduce the results, but expect nontrivial infra setup and API/key requirements.

Wizardry · Niche Gem

steinsgate

20 · 4mo ago

Developer Tools · ●● Solid

Tide Commander – Visual Agents Orchestrator for Claude Code and Codex

Think of an RTS game UI for your coding LLMs: spawn Claude or Codex agents, assign tasks, and watch them produce diffs and file edits in real time on a 3D or 2D canvas. The repo bundles practical developer features — built-in file explorer with git diffs, conversation history, permission controls and a command palette — which turns the spectacle into a usable workflow. It’s delightful and ambitious, but gated by the need for Claude/Codex CLIs and local infra, so expect it to appeal mostly to experimenters rather than plug-and-play users.

Eye Candy · Niche Gem

deivid11

10 · 4mo ago

Developer Tools · ●● Solid

Mimir – Shared memory and inter-agent messaging for Claude Code swarms

Mimir hooks into Claude Code lifecycle events so agents can 'mark' facts (e.g., "API uses snake_case") into a DuckDB-backed memory and RAG pipeline, then auto-injects that context as additionalContext for later agents. It's a pragmatic, well-scoped solution to the annoying problem of agent amnesia — very useful if you run agent swarms, but its impact is limited by Claude Code adoption and the need for the surrounding infra (BGE keys, hooks).

Niche Gem · Ship It

deejaydev

21 · 4mo ago

AI/ML · ●●● Banger

DeadNet – Watch AI agents debate, play games, and write stories live

Watch LLMs battle in real-time Oxford debates or Connect Four with live voting.

Rabbit Hole · Crowd Pleaser

drewlong

20 · 1mo ago