simple_ans – Asymmetric Numeral Systems Compression in Python/C++

by jmagland · Apr 17, 2026 · 3 points · 1 comment

AI Analysis

●● Solid · Niche Gem · Cozy

Single-file C++ ANS kernel beats wrestling with zstandard for quantized data.

Strengths

• Pure C++ kernel in one file (a few hundred lines) is genuinely simple and auditable
• pybind11 bindings make high-performance compression accessible from Python
• Targets a specific niche: quantized numerical data, where ANS excels over general-purpose compressors

Weaknesses

• Limited to 2-5000 distinct values, a significant constraint for many datasets
• Compression space is already crowded with zlib, zstandard, lzma, and blosc2

Post Description

The Asymmetric Numeral Systems (ANS) algorithm (Duda et al., 2015) is perhaps the most practical way of getting near-optimal compression ratios for independent and identically distributed random sequences of symbols drawn from a known discrete probability distribution. Simplest example: a random sequence of 0s and 1s with probability p of getting a 1. Shannon’s entropy formula gives the expected compression ratio for such a sequence, but realizing that ratio efficiently in a computer program is not an easy task. ANS does the trick and is incorporated into several general-purpose compression algorithms, but I wasn’t able to track down a simple, self-contained implementation that was reasonably performant.
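To make the binary example concrete, here is a minimal sketch of the entropy calculation. The function names are hypothetical (they are not part of simple_ans), and the ratio assumes the uncompressed sequence is stored at 1 bit per symbol.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits per symbol, of a 0/1 source with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0  # a deterministic source carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def ideal_compression_ratio(p, bits_per_symbol=1.0):
    """Raw size divided by the entropy bound (assumes 1 raw bit per symbol by default)."""
    h = binary_entropy(p)
    return math.inf if h == 0.0 else bits_per_symbol / h

print(f"{binary_entropy(0.1):.3f} bits/symbol")    # 0.469 bits/symbol
print(f"{ideal_compression_ratio(0.1):.2f}x")      # 2.13x
```

For p = 0.5 the entropy is exactly 1 bit per symbol, so no lossless coder can shrink such a sequence on average; the further p moves from 0.5, the more there is to gain.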

So I made simple_ans, a straightforward Python package built on a small yet efficient kernel of C++ code (a few hundred lines).

If you want it even simpler, there’s also a pure Python implementation in the repo (much slower though).
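For readers who want to see the core idea in a few lines, here is a toy sketch of the range variant of ANS (rANS). It is not the code from the repo: the function names are hypothetical, symbol frequencies are taken directly from the data, and the state is a single unbounded Python integer, whereas practical implementations keep the state in a fixed-width word and stream bits out as they go.

```python
import random
from collections import Counter

def build_tables(symbols):
    """Return (freqs, cum, total): per-symbol counts, cumulative offsets, and their sum."""
    counts = Counter(symbols)
    freqs, cum, total = {}, {}, 0
    for s in sorted(counts):
        freqs[s] = counts[s]
        cum[s] = total          # start of this symbol's slot range [cum, cum + freq)
        total += counts[s]
    return freqs, cum, total

def toy_rans_encode(symbols, freqs, cum, total, initial_state=1):
    """Fold all symbols into one big integer (assumption: unbounded state, no renormalization)."""
    state = initial_state
    for s in reversed(symbols):  # encode backwards so decoding yields forward order
        f = freqs[s]
        state = (state // f) * total + cum[s] + (state % f)
    return state

def toy_rans_decode(state, n, freqs, cum, total):
    """Recover n symbols; also return the leftover state (equals the encoder's initial state)."""
    slot_to_symbol = [s for s in freqs for _ in range(freqs[s])]
    out = []
    for _ in range(n):
        slot = state % total
        s = slot_to_symbol[slot]
        state = freqs[s] * (state // total) + slot - cum[s]
        out.append(s)
    return out, state

rng = random.Random(0)
data = [1 if rng.random() < 0.1 else 0 for _ in range(1000)]
freqs, cum, total = build_tables(data)
state = toy_rans_encode(data, freqs, cum, total)
decoded, final_state = toy_rans_decode(state, len(data), freqs, cum, total)
assert decoded == data and final_state == 1
print(state.bit_length(), "bits for", len(data), "binary symbols")
```

Each encoding step grows the state by roughly a factor of total / freq, which is about log2(1 / probability) bits for that symbol, so the final integer's bit length lands close to the entropy bound. The frequency table still has to be shared with the decoder, which this sketch ignores.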

I hope you find it interesting and/or useful!