Mimi in the browser – hear the semantic/acoustic split

Author: ymaws

by ymaws · Apr 20, 2026 · 4 points · 1 comment

AI Analysis

●● Solid · Rabbit Hole · Wizardry

Hearing the semantic/acoustic split of neural audio codecs directly in your browser is wild.

Strengths

• Runs full ONNX audio encoding locally in-browser without server roundtrips.
• Interactive codebook toggling reveals how semantic and acoustic data separate.
• Clean UI makes complex transformer architecture intuitively understandable.
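
To make the codebook toggling concrete, here is a minimal sketch of the idea, not the demo's actual code. It assumes a codec whose latent frame is the sum of one looked-up vector per codebook, and it treats codebook 0 as the semantic one and the rest as acoustic. The function name `decode_latent`, the toy sizes, and the random codebooks are all hypothetical; the real demo runs the Mimi ONNX model instead.

```python
import random

def decode_latent(codes, codebooks, enabled):
    """Sum the vectors of the enabled codebooks, frame by frame.

    codes:     codes[k][t] is the index chosen in codebook k at frame t
    codebooks: codebooks[k][i] is the vector stored at index i of codebook k
    enabled:   one bool per codebook (stands in for the UI toggles)
    """
    num_frames = len(codes[0])
    dim = len(codebooks[0][0])
    latent = [[0.0] * dim for _ in range(num_frames)]
    for k, is_on in enumerate(enabled):
        if not is_on:
            continue  # a muted codebook contributes nothing
        for t in range(num_frames):
            vec = codebooks[k][codes[k][t]]
            for d in range(dim):
                latent[t][d] += vec[d]
    return latent

# Toy data (assumed sizes, random values): 8 codebooks, 16 entries, 4 dims, 5 frames.
rng = random.Random(0)
K, SIZE, DIM, T = 8, 16, 4, 5
codebooks = [[[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(SIZE)]
             for _ in range(K)]
codes = [[rng.randrange(SIZE) for _ in range(T)] for _ in range(K)]

full = decode_latent(codes, codebooks, [True] * K)
semantic_only = decode_latent(codes, codebooks, [True] + [False] * (K - 1))
acoustic_only = decode_latent(codes, codebooks, [False] + [True] * (K - 1))
```

Under this additive assumption, the semantic-only and acoustic-only latents sum back to the full latent, which is why muting codebooks in a UI can isolate what each group contributes before the decoder turns the latent into audio.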

Weaknesses

• Tied entirely to Kyutai's Mimi model, not a general-purpose tool.
• Educational demo limits long-term utility beyond initial curiosity.

Similar Projects

AI/ML · ●● Solid

RAG chunking playground: visualize how your docs get split

Visual chunking comparison beats guessing — export production-ready code.

Solve My Problem · Niche Gem

Horatius77

102mo ago

Productivity · ●● Solid

Rhizome – semantic backlinks for your notes, generated locally

Writes wikilinks directly into files—no database, no daemon, just markdown.

Big Brain · Cozy · Niche Gem

matzalazar

302mo ago

AI/ML · ●●● Banger

Decompose – Split text into classified semantic units, no LLM, 14ms

Non-LLM deterministic semantic decomposition—14ms, no hallucination, MCP-ready.

Big Brain · Solve My Problem · Wizardry

echology-io

104mo ago

AI/ML · ●● Solid

Instrumental Model from Scratch (With Demo)

The architecture is the project's real showpiece: a 72-band non-uniform band-split BiMamba U-Net that uses Mamba scans for O(T) memory and interleaved attention in the bottleneck to mix cross-frequency context, a clever tradeoff between temporal efficiency and global attention. The author ships a runnable demo and an explanatory write-up so you can reproduce the approach, but it is clearly hobby-scale (trained on ≈1k songs, served from a queue on a single home PC, slow cold starts), so expect experimental results rather than SOTA separation or instant throughput.

Wizardry · Niche Gem

day6

103mo ago

AI/ML · ● Mid

Search dashcam footage by describing what happened

Gemini video embeddings for dashcam search, even though Google Photos already indexes footage.

Ship It

sohamrj

603mo ago

Developer Tools · ● Mid

AI agent coded Python 3.14 interpreter in Rust

An AI-coded Python interpreter is impressive, but mostly a novelty without a real use case.

Wizardry · Bold Bet

blueblazin

203mo ago