I fit a 9-agent LLM pipeline into 1.5GB of RAM on iOS

Author: TheCosmicStage

by TheCosmicStage · Mar 5, 2026 · 2 points · 0 comments


AI Analysis

●●● Banger · Wizardry · Big Brain · Ship It

ExecuTorch compilation plus speculative decoding fits a 9-agent LLM pipeline into 1.5GB of RAM on iOS.

Strengths

• The Blackboard pattern decouples multi-agent reasoning and avoids the context degradation of chaining agents sequentially, solving a real architectural problem.
• Ahead-of-time compilation of PyTorch graphs to .pte binaries eliminates wrapper overhead; speculative decoding gives a measured 2.2-3.6x speedup.
• Tiered model strategy (1B/3B/11B) with an identical architecture across hardware: a thoughtful, constraint-driven design that balances capability with device reality.
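
As a rough illustration of the Blackboard pattern named in the first strength, here is a minimal Python sketch. It is not the project's code: the `Blackboard` class, the two persona functions, and `run_agents` are hypothetical names, and plain string functions stand in for model calls. The point it shows is that every agent reads one shared state object instead of the previous agent's growing transcript.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Blackboard:
    """Shared state that every agent reads from and writes to (hypothetical)."""
    entry: str                                            # journal entry under analysis
    notes: Dict[str, str] = field(default_factory=dict)   # agent name -> finding


# An agent is any function that reads the board and returns its finding.
Agent = Callable[[Blackboard], str]


def summarizer(board: Blackboard) -> str:
    # Hypothetical persona: keeps only the first sentence of the entry.
    return board.entry.split(".")[0].strip() + "."


def mood_tagger(board: Blackboard) -> str:
    # Hypothetical persona: a keyword lookup standing in for a model call.
    return "positive" if "great" in board.entry.lower() else "neutral"


def run_agents(board: Blackboard, agents: Dict[str, Agent]) -> Blackboard:
    # Each agent sees the same board rather than a chain of earlier
    # transcripts, so its input does not grow with every hop.
    for name, agent in agents.items():
        board.notes[name] = agent(board)
    return board


board = run_agents(
    Blackboard(entry="Had a great run this morning. Work was slow."),
    {"summary": summarizer, "mood": mood_tagger},
)
print(board.notes)
# {'summary': 'Had a great run this morning.', 'mood': 'positive'}
```

In a real pipeline each agent would presumably be a prompt against the on-device model, with the board serialized into a compact context; that part is left out here.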

Weaknesses

• Pre-release tech spec with no live demo, ship date, or user testing; vaporware risk outweighs the architectural innovation.
• Whisper voice input and biometrics are promised but incomplete; the shipping timeline is unclear, and critical journaling features (export, sync, backup) are missing.

Post Description

"Hey HN. I've been building a completely offline AI journal. The biggest hurdle was the memory footprint of running multiple agent personas. I ended up bypassing standard wrappers and using Meta's ExecuTorch to compile the PyTorch graphs ahead-of-time for the Apple Neural Engine, plus 4-bit quantization. Happy to answer any questions about the CoreML backend or managing the 'Blackboard' state object for the agents without killing the battery."
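
To put the 4-bit quantization in perspective, the back-of-the-envelope estimate below computes weights-only memory for the 1B/3B/11B tiers mentioned in the analysis. This is a sketch under stated assumptions: the function name is made up for illustration, and the figures ignore the KV cache, activations, quantization scales, and runtime overhead. The post does not say which tier the 1.5GB figure refers to.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Weights-only size in GB (10^9 bytes); excludes KV cache and activations."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Compare 16-bit and 4-bit storage for each tier size (in billions of parameters).
for size in (1, 3, 11):
    fp16 = weight_footprint_gb(size, 16)
    int4 = weight_footprint_gb(size, 4)
    print(f"{size}B params: fp16 ~ {fp16:.1f} GB, 4-bit ~ {int4:.1f} GB")
```

Going from 16-bit to 4-bit weights cuts the weights-only footprint by a factor of four, which is the main lever for fitting a model into a phone's memory budget.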

Similar Projects

Developer Tools · ●● Solid

Composable middleware for LLM inference Optimization Passes

Tower-style middleware stacking for inference guardrails beats bolted-on if-statements.

Big Brain · Niche Gem · Ship It

human_hack3r

70 · 3mo ago

Health · ●● Solid

Odozi – open-source iOS journaling app

Correlates mood against Screen Time and HealthKit data automatically on device.

Cozy · Niche Gem

jlarks32

60 · 1mo ago

Health · ● Mid

SweatDiary – simple workout journal, native iOS and macOS app

Free native workout diary with iCloud sync, but Strava and Hevy already dominate.

Cozy · Eye Candy

frooto443

11 · 2mo ago

AI/ML · ●● Solid

LLM-use – cost-effective LLM orchestrator for agents

Smart local‑first routing that only escalates to expensive cloud planners when necessary is the standout idea — combined with per‑run cost accounting and full Ollama offline support it solves a real operational itch. The repo is a pragmatic, CLI/TUI-focused toolkit (scraping + cache, MCP server mode) that feels useful for teams wanting a no‑friction orchestrator, but it’s playing in a crowded space of agent frameworks so the novelty is incremental rather than revolutionary.

Niche Gem · Big Brain

justvugg

21 · 4mo ago

Developer Tools · ●● Solid

Iosef, an iOS simulator CLI designed for agents

Stateless CLI + MCP bridge designed for agents, but already faces incumbent idb and simctl.

Niche Gem · Big Brain

riwsky

10 · 3mo ago

AI/ML · ●● Solid

Memex – A local-first AI journal that keeps everything as Markdown

Local-first AI journal with multi-agent architecture when most competitors store everything in the cloud.

Dark Horse · Solve My Problem

sparkleMing

10 · 20d ago