GitHub Repository

Static-allocation MLP inference in ANSI C using a 2-slot circular buffer with fixed-stride indexing. An easy-to-use, minimal MLP alternative to GiorgosXou/NeuralNetworks, enhanced with PROGMEM, int-quantization, etc.

4 stars · C

Static-allocation MLP inference in ANSI C using a 2-slot ring buffer

by xou·May 29, 2026·4 points·0 comments

AI Analysis

●●● Banger · Niche Gem · Wizardry · Big Brain

Two-slot ring buffer brings MLP RAM usage close to the practical lower bound on microcontrollers.

Strengths

•Header-only ANSI C means zero build dependencies for embedded projects.
•Static allocation eliminates heap fragmentation risks on memory-constrained MCUs.
•Supports int-quantization and PROGMEM for AVR without external libraries.

Weaknesses

•Only supports feedforward MLPs, no CNN or transformer architectures yet.
•Training requires Python script export, no on-device learning capability.

Post Description

I've been experimenting since 2019 with ways to minimize RAM usage for tiny MLP inference on microcontrollers. [0]

This project is the result of that exploration: a fully static-allocation approach to MLP inference in ANSI C, using a simple 2-slot ring buffer to keep memory usage predictable and extremely low while staying fast.

I believe this is close to the practical lower bound for RAM usage in general-purpose CPU MLP inference without sacrificing speed or introducing runtime complexity.
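For readers unfamiliar with the idea, here is a minimal sketch of a two-slot scheme. It is not the project's actual code: the network shape, the weights, the ReLU activation, and every name (`mlp_forward`, `slots`, `MAX_WIDTH`, and so on) are assumptions made up for illustration. Two static buffers, each as wide as the widest layer, alternate roles: one holds the current layer's inputs while the other receives its outputs, then they swap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy network: 2 inputs -> 3 hidden -> 1 output. */
#define NUM_LAYERS 3
#define MAX_WIDTH  3   /* widest layer; sizes both slots */

static const size_t layer_sizes[NUM_LAYERS] = {2, 3, 1};

/* Made-up weights, row-major (one row per output neuron), layers back to back. */
static const float weights[] = {
     0.5f, -0.5f,        /* hidden neuron 0 */
     1.0f,  1.0f,        /* hidden neuron 1 */
    -1.0f,  0.5f,        /* hidden neuron 2 */
     1.0f,  0.5f, -1.0f  /* output neuron 0 */
};
static const float biases[] = {
    0.0f, 0.5f, 0.0f,    /* hidden layer */
    0.25f                /* output layer */
};

/* The only activation storage: two fixed slots, no heap. */
static float slots[2][MAX_WIDTH];

static float relu(float x) { return x > 0.0f ? x : 0.0f; }

/* Runs the forward pass. The returned pointer refers to static storage
   and is overwritten by the next call. */
const float *mlp_forward(const float *input)
{
    const float *w = weights;
    const float *b = biases;
    size_t cur = 0;      /* index of the slot holding the current inputs */
    size_t l, i, j;

    assert(layer_sizes[0] <= MAX_WIDTH);
    for (i = 0; i < layer_sizes[0]; i++)
        slots[cur][i] = input[i];

    for (l = 1; l < NUM_LAYERS; l++) {
        size_t n_in  = layer_sizes[l - 1];
        size_t n_out = layer_sizes[l];
        const float *src = slots[cur];
        float *dst = slots[cur ^ 1];

        assert(n_out <= MAX_WIDTH);
        for (j = 0; j < n_out; j++) {
            float acc = b[j];
            for (i = 0; i < n_in; i++)
                acc += w[j * n_in + i] * src[i];
            dst[j] = relu(acc);
        }
        w += n_in * n_out;   /* advance to the next layer's parameters */
        b += n_out;
        cur ^= 1;            /* swap roles: outputs become inputs */
    }
    return slots[cur];
}
```

With these toy weights, passing `{1.0f, 2.0f}` gives an output of 2.0. Activation RAM in this sketch is `2 * MAX_WIDTH * sizeof(float)` regardless of network depth, which is 24 bytes here assuming a 4-byte `float`.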

A more aggressive approach I've previously used is allocating and freeing memory per layer-to-layer pair during inference, but that introduces overhead and fragmentation if not used carefully. [1]
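A rough sketch of what a per-layer allocate-and-free pattern can look like, using a made-up toy network. This is not the `REDUCE_RAM_DELETE_OUTPUTS` code from [0]; `mlp_forward_heap`, the weights, and the ReLU activation are all hypothetical. In this version, only one input buffer and one output buffer are alive at a time, so peak activation memory follows the largest adjacent pair of layers, at the cost of a `malloc`/`free` per layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy network: 2 inputs -> 3 hidden -> 1 output. */
#define NUM_LAYERS 3

static const size_t layer_sizes[NUM_LAYERS] = {2, 3, 1};

/* Made-up weights, row-major (one row per output neuron), layers back to back. */
static const float weights[] = {
     0.5f, -0.5f,
     1.0f,  1.0f,
    -1.0f,  0.5f,
     1.0f,  0.5f, -1.0f
};
static const float biases[] = {
    0.0f, 0.5f, 0.0f,
    0.25f
};

/* Returns a heap buffer holding the last layer's outputs, or NULL if an
   allocation fails. The caller owns the buffer and must free() it. */
float *mlp_forward_heap(const float *input)
{
    const float *w = weights;
    const float *b = biases;
    float *src;
    float *dst;
    size_t l, i, j;

    assert(input != NULL);
    src = (float *)malloc(layer_sizes[0] * sizeof *src);
    if (src == NULL)
        return NULL;
    for (i = 0; i < layer_sizes[0]; i++)
        src[i] = input[i];

    for (l = 1; l < NUM_LAYERS; l++) {
        size_t n_in  = layer_sizes[l - 1];
        size_t n_out = layer_sizes[l];

        /* Allocate exactly this layer's outputs. */
        dst = (float *)malloc(n_out * sizeof *dst);
        if (dst == NULL) {
            free(src);
            return NULL;
        }
        for (j = 0; j < n_out; j++) {
            float acc = b[j];
            for (i = 0; i < n_in; i++)
                acc += w[j * n_in + i] * src[i];
            dst[j] = acc > 0.0f ? acc : 0.0f;   /* ReLU */
        }
        /* The inputs are no longer needed once the outputs exist. */
        free(src);
        src = dst;
        w += n_in * n_out;
        b += n_out;
    }
    return src;
}
```

The repeated small allocations of different sizes are where the overhead and fragmentation risk come from, which is the trade-off against a fixed pair of static slots.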

Curious how it compares to other minimal inference implementations people have seen (or built). Feedback and edge cases welcome. Hope you like it. Have fun. <3

[0]: https://github.com/GiorgosXou/NeuralNetworks#-research
[1]: look for REDUCE_RAM_DELETE_OUTPUTS in the source of [0]