A deterministic ecosystem simulator for long-horizon AI agents

by yangkecoy·Mar 6, 2026·1 point·0 comments

AI Analysis

●●● Banger · Big Brain · Wizardry · Zero to One

Deterministic multi-agent evolutionary benchmark with SHA-256 reproducible capsules for agent testing.

Strengths

• Deterministic replay via SHA-256 capsules: agents are hashable, shareable, and fully reproducible across seeds
• Eight frozen scenario suites isolate distinct selection pressures (collapse, scarcity, mutation, entropy), a rigorous benchmark design
• Novel research finding: a mutation-intensity crossover at which Gray encoding loses locality under scaling, a publishable insight

Weaknesses

• Early-stage platform (v1.0): limited public agent test data and no comparison to standard benchmarks such as the Arcade Learning Environment (ALE, the Atari suite)
• Niche audience: AI researchers and RL specialists only; not actionable for most agent builders

Post Description

Ten years ago I built a small evolutionary toy experiment with two types of agents: selfish and cooperative “ducks”.

At first, selfish strategies dominated. But when agents were given memory — the ability to remember who helped them — cooperation suddenly became stable under resource scarcity.

That experiment stayed in the back of my mind for years.

Recently I started rebuilding the idea from scratch as a larger system:

BiomeSyn

https://biomesyn.com/

Instead of evaluating AI on static tasks, the goal is to explore long-horizon adaptive environments where agents must:

• gather resources
• survive environmental pressure
• compete with other agents
• adapt over many generations

The system is deterministic, so experiments can be reproduced across seeds — which makes it possible to treat it as a benchmark for adaptive agents.
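
To make the reproducibility claim concrete, here is a minimal sketch of what "same seed, same run" can look like. It is not BiomeSyn's code: `run_simulation`, `capsule_hash`, the toy selection rule, and every parameter are hypothetical names and choices made up for illustration. The only idea taken from the post is that a seeded, deterministic run can be fingerprinted (here with SHA-256) and compared.

```python
import hashlib
import json
import random


def run_simulation(seed: int, generations: int = 50, population: int = 20) -> list[dict]:
    """Toy ecosystem (hypothetical): agents gather a fluctuating resource,
    the better half survive, and mutated copies refill the population."""
    rng = random.Random(seed)  # private RNG, so no global state leaks in
    # Each agent is reduced to a single "gathering skill" in [0, 1].
    agents = [rng.random() for _ in range(population)]
    history = []
    for gen in range(generations):
        # Environmental pressure: resource supply changes every generation.
        supply = rng.uniform(0.5, 1.5)
        scores = [skill * supply for skill in agents]
        # Competition: rank agents by score and keep the top half.
        ranked = sorted(zip(scores, agents), reverse=True)
        survivors = [skill for _, skill in ranked[: population // 2]]
        # Adaptation: each survivor leaves one slightly mutated child.
        children = [min(1.0, max(0.0, s + rng.gauss(0, 0.05))) for s in survivors]
        agents = survivors + children
        mean_skill = round(sum(agents) / len(agents), 6)
        history.append({"generation": gen, "mean_skill": mean_skill})
    return history


def capsule_hash(seed: int, history: list[dict]) -> str:
    """Fingerprint a run: canonical JSON of seed plus history, hashed with SHA-256."""
    payload = json.dumps(
        {"seed": seed, "history": history},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    first = capsule_hash(42, run_simulation(42))
    second = capsule_hash(42, run_simulation(42))
    print(first == second)  # identical seeds should give identical hashes
```

In this sketch, two people who run the same seed on the same Python version can compare a single hash instead of full logs. Everything random flows through one seeded `random.Random` instance, and the record is serialized in a canonical form before hashing, so neither key order nor whitespace can change the digest.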

The bigger question I’m interested in:

> What happens when intelligence is evaluated inside a world that keeps evolving?

Many current benchmarks measure short-episode performance. But real adaptive systems must operate in open-ended environments.

BiomeSyn is still an early research sandbox, but I’m curious whether environments like this could become useful for studying:

• evolutionary computation
• long-horizon RL agents
• multi-agent ecosystems
• adaptive AI systems

Would be interested to hear thoughts from people working on agents, simulation platforms, or large-scale AI systems.

https://biomesyn.com/

Similar Projects

AI/ML · ● Mid

CivBench: a long-horizon AI benchmark for multi-agent games

Civilization matches expose model divergence that static benchmarks miss—but it's a spectacle, not a measurement.

Rabbit Hole · Big Brain

mbh159

12243mo ago

Developer Tools · ●● Solid

Lazarus, a coding agent for long-horizon tasks

Persistent Python runtime keeps state alive across tool calls, unlike Claude Code's stateless tools.

Big Brain · Niche Gem

Sai_Praneeth

1014d ago

Developer Tools · ●●● Banger

Tracecore: Benchmark AI Agents on Deterministic Coding Tasks

Deterministic agent benchmarking with strict validation—unlike SWE-Bench, measures whether agents actually operate.

Solve My Problem · Wizardry · Niche Gem

extra_cookin

103mo ago

AI/ML · ● Mid

Self-managing codebase with long-horizon agents

Multi-agent orchestration demo, but bootstrapping still requires humans—Cognition Labs did this first.

Bold Bet · Ship It

wrftaylor

2124d ago

AI/ML · ●● Solid

Hivecrew: Native macOS app for parallel long-horizon Omni agents

VM isolation beats Docker for agent safety, but macOS virtualization overhead is real.

Big Brain · Ship It

johnbean393

322mo ago

AI/ML · ●●● Banger

Benchmarking Tangible Interface Understanding in Long-Horizon Tasks

First benchmark testing if AI agents can actually flip light switches and read appliance panels.

Big Brain · Niche Gem

tellarin

111mo ago