The Logos Machine – AI that distill knowledge into weights during Sleep
Email-gated DocSend deck with grandiose claims and zero shipped code.
A language model that forms persistent memories from conversation and maintains them through sleep. MEMIT weight editing + null-space-constrained maintenance.
Direct weight editing for persistent memory—MEMIT meets LoRA consolidation with null-space math.
ML researchers, LLM engineers, memory system researchers
MEMIT (Baulab) · LoRA fine-tuning · Traditional RAG + vector databases
During wake, facts from conversation are injected directly into MLP weights via MEMIT (a single forward pass, instant recall). During sleep, the system audits which memories degraded, refreshes them with null-space constraints (guaranteeing orthogonality to working memories), then progressively transfers knowledge into LoRA — like biological memory consolidation from hippocampus to neocortex.
The key problem was a hard capacity ceiling: the 8B model sustains 0.92 recall up to 13 facts, then crashes to 0.57 at fact 14 — a sharp phase transition, not gradual decay. And LoRA consolidation was blocked by what I call the "alignment tax": RLHF training fights back against injected knowledge (37% recall loss on 8B from a single LoRA pass).
The fix: per-fact graduated consolidation. Each fact independently tracks its own stage and advances only when LoRA proves it absorbed that specific fact. A dissolution schedule (1.0 → 0.5 → 0.1 → 0.0) gradually removes the MEMIT edit as LoRA takes over. And cumulative fusing — training each cycle on the already-fused model — reduces the alignment tax from catastrophic to negligible (starting loss drops 2.91 → 0.62 by cycle 2).
Results on Llama 3.1 8B (4-bit, 2×H100): - 100% advancement rate at 5/10/15/20 facts - 1.00 chat recall at all scales - MEMIT edits dissolve on schedule, making the buffer renewable - Effective lifetime capacity: unbounded
There's also a biological curiosity: individual facts consolidate at different rates. One synthetic fact ("Aria lives in Portland") is consistently the hardest across very run — some memories are just harder to absorb, same as in biological systems.
6 papers documenting the full journey from initial LoRA prototype to this result: https://doi.org/10.5281/zenodo.18779159
Built with: Python, PyTorch, PEFT, BitsAndBytes, Llama 3.1. Runs on MacBook Air (3B) or H100 (8B/70B).
Email-gated DocSend deck with grandiose claims and zero shipped code.
Zero-initialized overlay changes model beliefs without touching a single base weight.
Tests 13 LLMs on 291 people to reveal what's actually baked into model weights.
Outperforms existing open-source injection detectors on ProtectAI and Qualifire benchmarks.
Neuroscience blog post posing as a research paper—no code, benchmarks, or reproducibility.
Streams LLM weights from CD-ROM during inference to fit 77MB models in 32MB RAM.