Train a GPT from scratch in the browser – Karpathy's microGPT

by jayyvk·Mar 3, 2026·1 point·1 comment

AI Analysis

●● Solid · Eye Candy · Rabbit Hole

Karpathy's microGPT in the browser with live loss curves, but pedagogical only—no production value.

Strengths
  • Faithful JavaScript reimplementation of microGPT stays responsive using Web Workers for compute.
  • Node-based editor visualizes data flow (dataset → tokenizer → config → training → generation) pedagogically.
  • No backend required; all backprop and attention math runs client-side in a constrained, learnable form.
Weaknesses
  • Purely educational: models of about 5K parameters cannot produce useful output, so there is little to evaluate or iterate on after training.
  • JavaScript numeric performance is a bottleneck; training is slower than in desktop implementations, which makes exploration frustrating.
Category
Target Audience

ML learners, educators, developers curious about transformer internals

Similar To

fast.ai · TensorFlow.js examples · Andrej Karpathy's makemore

Post Description

Faithful reimplementation of Karpathy's microGPT that runs entirely in a Web Worker. You pick a dataset (YC startups, baby names, dinosaurs, or upload your own), configure the architecture, and watch the loss curve drop in real-time as it trains. Then generate text from your model. Everything runs client-side - back propagation, attention, the whole training loop. The fun technical constraint was keeping the UI responsive while doing matrix math in JavaScript. Built it as a node-based editor so you can see the data flow from dataset → tokenizer → config → training → generation.

Github: https://github.com/jayyvk/trainmyowngpt
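
The dataset → tokenizer → training flow described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not code from the trainmyowngpt repository: it assumes a character-level tokenizer, and the names `LossMessage`, `buildVocab`, `encode`, `decode`, and `runTraining` are hypothetical. The `post` callback stands in for whatever channel the worker uses to report progress.

```typescript
// Illustrative sketch only; none of these names come from the trainmyowngpt repo.

// Hypothetical message a training worker could post back to the UI thread.
type LossMessage = { type: "loss"; step: number; loss: number };

// Tokenizer stage: collect the distinct characters of the dataset, sorted.
function buildVocab(docs: string[]): string[] {
  return Array.from(new Set(docs.join("").split(""))).sort();
}

// Map each character to its index in the vocabulary.
function encode(text: string, vocab: string[]): number[] {
  return text.split("").map((ch) => {
    const id = vocab.indexOf(ch);
    if (id < 0) throw new Error(`character not in vocabulary: ${ch}`);
    return id;
  });
}

// Map token IDs back to text.
function decode(ids: number[], vocab: string[]): string {
  return ids.map((id) => vocab[id]).join("");
}

// Training stage: run `totalSteps` steps and report the loss after each one.
// `stepFn` is a placeholder for the real forward/backward pass.
function runTraining(
  totalSteps: number,
  stepFn: (step: number) => number,
  post: (msg: LossMessage) => void
): void {
  for (let step = 0; step < totalSteps; step++) {
    post({ type: "loss", step, loss: stepFn(step) });
  }
}
```

Inside a Web Worker, `post` would typically wrap `self.postMessage`, and the UI thread would append each message to the loss chart in its `onmessage` handler. Because the loop itself runs off the main thread, the page stays responsive.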

Similar Projects

Education · ●●● Banger

How-to-Train-Your-GPT

Build a LLaMA-style model from scratch with zero ML prerequisites or math.

Cozy · Big Brain
RaiyanYahya
101mo ago
Education · ●● Solid

How-to-train-your-GPT. Every line commented

Explains attention mechanisms to five-year-olds while building LLaMA 3 from scratch.

Cozy · Niche Gem
mateenah
401mo ago
AI/ML · ●●● Banger

MicroGPT-C – C99 GPT for Edge Training and Tiny Model Pipelines

Karpathy's microgpt in C99, proves tiny coordinated models beat single large models on logic.

Wizardry · Big Brain
Ajay__soni
103mo ago
Education · ●● Solid

Interactive visualizer for Karpathy's 243-line microGPT

Type a name and watch characters turn into IDs, 16-dim embeddings get summed with positional encodings, and causal attention matrices animate per head, all matched numerically to Karpathy's 243-line microGPT. The implementation is pure TypeScript (no ML libs) and includes a scrollable sidebar with the reference math, which makes it a low-friction learning tool: more pedagogical deep dive than research innovation.

Rabbit Hole · Niche Gem · Eye Candy
Sayyed23
114mo ago
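
The causal attention matrices mentioned in the visualizer entry above can be illustrated with a short sketch. This is not code from that project: `causalAttentionWeights` is a hypothetical helper that assumes one attention head, with queries and keys given as plain arrays. It uses scaled dot-product scores and a row-wise softmax restricted to the current and earlier positions.

```typescript
// Illustrative sketch only; not taken from the visualizer or from microGPT.
// q and k hold one vector per position (T x d) for a single attention head.
function causalAttentionWeights(q: number[][], k: number[][]): number[][] {
  const d = q[0].length;
  return q.map((qi, i) => {
    // Scaled dot-product scores against positions 0..i only (the causal mask).
    const scores = k
      .slice(0, i + 1)
      .map((kj) => kj.reduce((sum, v, t) => sum + v * qi[t], 0) / Math.sqrt(d));
    // Numerically stable softmax over the visible positions.
    const max = Math.max(...scores);
    const exps = scores.map((s) => Math.exp(s - max));
    const total = exps.reduce((a, b) => a + b, 0);
    // Future positions (j > i) receive exactly zero weight.
    return k.map((_, j) => (j <= i ? exps[j] / total : 0));
  });
}
```

Each row of the result sums to 1, and everything above the diagonal is 0. That lower-triangular shape is the pattern a per-head attention animation would display.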
AI/ML · Mid

PicoGPT – GPT in a QR Code

The author minified Karpathy’s MicroGPT, ported it to 39 lines of JS (including a tiny autograd, MHA, AdamW and training loop) and shoehorned the whole gzipped HTML into a version-40 QR code that the browser decompresses and runs. It's clearly a stunt — the model is toy-scale (≈4k params, 8-token context) — but the compression trick, browser-native DecompressionStream use, and runnable-in-QR delivery are a delightful technical flex.

Wizardry · Crowd Pleaser
kuberwastaken
104mo ago
AI/ML · ●●● Banger

Andrej Karpathy's microgpt.py to C99 microgpt.c – 4,600x faster

Pure C99 GPT with SIMD, 4,600x faster than the Python original; drop two files into any project.

Wizardry · Zero to One
Ajay__soni
4034mo ago