RunAnywhere – Faster AI Inference on Apple Silicon
Custom Metal shaders beat llama.cpp and MLX—1.67x faster on M4 Max.
Bit-exact f64 emulation on Metal GPUs where Apple's native double support is missing.
Graphics programmers and simulation engineers on Apple Silicon
SoftFloat · CUDA SoftFloat
Native macOS VMs with APFS snapshots beat Docker for agent isolation.
Autonomous agent wrote custom Metal kernels boosting decode speed 42% over upstream llama.cpp.
MLX-powered local TTS plugin for OpenClaw: elegant, but limited to Apple Silicon users.
M3/M4 thermal-manager unlock that most older fan tools don't handle.
Classic treemap UI back on native Apple Silicon, but disk space visualizers already exist.