GitHub Repository

Metal-accelerated 3D Gaussian Splatting for Apple Silicon

29 stars · Metal

Msplat – 3D Gaussian Splatting training in ~90s on M4 Max, native Metal

by rayanht·Mar 5, 2026·2 points·0 comments

AI Analysis

●●● Banger · Wizardry · Zero to One · Niche Gem

3DGS training (7k iterations) in ~90 seconds on an M4 Max via fused Metal kernels, with no PyTorch overhead: an unprecedented Apple Silicon story.

Strengths
  • From-scratch Metal pipeline (projection, rasterization, SSIM, Adam) eliminates the ~2GB of PyTorch framework overhead; genuinely novel constraint-driven architecture
  • GPU-resident densification and tile-local sorting are clever kernel designs; tightly integrated optimization for Apple hardware
  • Native Swift bindings unlock 3DGS in iOS/macOS apps—no existing tool does this; real differentiation for on-device 3D AI
Weaknesses
  • Performance claims (90s on M4 Max vs. TITAN RTX) lack controlled comparison; different hardware and software stacks make speedup unclear
  • Early-stage (9 commits); no published benchmarks, no examples of trained models, no discussion of convergence quality vs. original implementation
Target Audience

ML engineers on Apple Silicon, researchers needing fast 3DGS training, iOS/macOS app developers

Similar To

gsplat · taichi-3dgs · threestudio

Post Description

Hey HN, I built msplat because I wanted to train 3DGS scenes on my Mac without pulling in torch. Most ports I came across go through autograd and hence come with ~2GB of framework overhead, which felt like overkill for a pipeline that's just a few dozen GPU kernels + an optimizer.

So I wrote the whole training pipeline from scratch as Metal shaders: projection, tile-based rasterization, SSIM loss, backward pass, Adam, and densification. Everything runs on the GPU.
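Of the stages listed above, the Adam update is the easiest to show in a few lines. What follows is a minimal CPU-side Python sketch of the standard Adam rule, not msplat's Metal kernel; the function name `adam_step`, the flat-list parameter layout, and the default hyperparameters are assumptions made for illustration.

```python
import math

def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one standard Adam update to flat lists of floats.

    t is the 1-based step count. Returns new (params, m, v) lists.
    Hypothetical helper for illustration; not part of msplat's API.
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        # Exponential moving averages of the gradient and its square.
        m_i = beta1 * m_i + (1.0 - beta1) * g
        v_i = beta2 * v_i + (1.0 - beta2) * g * g
        # Bias correction compensates for the zero-initialised averages.
        m_hat = m_i / (1.0 - beta1 ** t)
        v_hat = v_i / (1.0 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=200, lr=0.1):
    """Toy usage: minimise f(x) = (x - 3)^2 starting from x = 0."""
    params, m, v = [0.0], [0.0], [0.0]
    for t in range(1, steps + 1):
        grads = [2.0 * (params[0] - 3.0)]
        params, m, v = adam_step(params, grads, m, v, t, lr=lr)
    return params[0]
```

Each parameter is updated independently of the others, which is why an update like this maps naturally onto one GPU thread per parameter. How msplat actually lays out and dispatches its optimizer kernel is not stated in the post.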

msplat trains 7k iterations of full-resolution Mip-NeRF 360 scenes in ~90s on my M4 Max. In the README I compare against gsplat's published numbers, which were measured on a TITAN RTX. Of course these are different hardware classes, so take the wall-time comparisons with a grain of salt.
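For a sense of scale, the headline figure can be converted into per-iteration throughput. This is plain arithmetic on the two approximate numbers quoted above (7k iterations, ~90s); the variable names are illustrative and the result is only as precise as those inputs.

```python
# Figures quoted in the post; both are approximate.
iterations = 7_000
wall_time_s = 90.0

# Average throughput and per-iteration cost over the whole run.
iterations_per_second = iterations / wall_time_s
ms_per_iteration = 1000.0 * wall_time_s / iterations

print(f"{iterations_per_second:.1f} it/s, {ms_per_iteration:.1f} ms/it")
```

That works out to roughly 78 iterations per second, or about 13 ms per iteration on average. It says nothing about how the time is split between rasterization, the backward pass, and densification.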

Python bindings are on PyPI (pip install msplat), and there are Swift bindings if you want to embed this in a native app. Happy to answer questions about any of the internals.

Repo: https://github.com/rayanht/msplat (Apache 2.0)

Similar Projects

SVO Voxelization for Gaussian Splat Collisions

Using an SVO to voxelize Gaussian splats is a sensible way to prune overlap checks — hierarchical voxels fit the problem and should cut costly pairwise collisions. Can't judge the execution: the Reddit thread is blocked with no visible code, benchmarks, or demos, so this currently reads like an intriguing sketch rather than a drop-in tool.

Niche Gem · Wizardry
slimbuck
503mo ago