Train a GPT from scratch in the browser – Karpathy's microGPT

Author: jayyvk

by jayyvk · Mar 3, 2026 · 1 point · 1 comment

AI Analysis

●● Solid · Eye Candy · Rabbit Hole

Runs Karpathy's microGPT in the browser with live loss curves, but it is pedagogical only and has no production value.

Strengths

• Faithful JavaScript reimplementation of microGPT; the UI stays responsive because compute runs in a Web Worker.
• Node-based editor visualizes the data flow (dataset → tokenizer → config → training → generation), which suits teaching.
• No backend required; all backprop and attention math runs client-side, at a scale small enough to study.

Weaknesses

• Purely educational: models of roughly 5K parameters cannot produce useful output, so there is little to evaluate or iterate on once training finishes.
• JavaScript numeric performance is a bottleneck; training is slower than in desktop implementations, which makes exploration frustrating.
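
To give a sense of how small a model in the "few thousand parameters" range is, here is a rough parameter count for a bias-free GPT. The `GPTConfig` shape and the values in `tiny` are assumptions chosen for illustration, not the project's actual settings; the count also assumes an untied output head and a 4x-wide MLP.

```typescript
// Hypothetical config; field names and values are assumptions for illustration.
interface GPTConfig {
  vocabSize: number; // e.g. 26 letters plus 1 special token
  blockSize: number; // context length in tokens
  nEmbd: number;     // embedding width
  nLayer: number;    // number of transformer blocks
}

// Counts weights for a bias-free GPT with an untied output head and a 4x MLP.
function countParams(c: GPTConfig): number {
  const tokenEmb = c.vocabSize * c.nEmbd;
  const posEmb = c.blockSize * c.nEmbd;
  const attention = 4 * c.nEmbd * c.nEmbd;  // Q, K, V and output projections
  const mlp = 2 * c.nEmbd * (4 * c.nEmbd);  // up- and down-projection
  const lmHead = c.vocabSize * c.nEmbd;
  return tokenEmb + posEmb + c.nLayer * (attention + mlp) + lmHead;
}

const tiny: GPTConfig = { vocabSize: 27, blockSize: 16, nEmbd: 16, nLayer: 1 };
console.log(countParams(tiny)); // 4192
```

The number of attention heads does not appear in the count, because splitting the embedding across heads does not change the size of the projection matrices.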

Post Description

Faithful reimplementation of Karpathy's microGPT that runs entirely in a Web Worker. You pick a dataset (YC startups, baby names, dinosaurs, or upload your own), configure the architecture, and watch the loss curve drop in real time as it trains. Then you generate text from your model. Everything runs client-side: backpropagation, attention, the whole training loop. The fun technical constraint was keeping the UI responsive while doing matrix math in JavaScript. Built it as a node-based editor so you can see the data flow from dataset → tokenizer → config → training → generation.

GitHub: https://github.com/jayyvk/trainmyowngpt
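
One common way to keep a page responsive during training is to run the loop off the main thread and send only small progress messages back for the loss chart. The sketch below is a guess at that pattern, not the project's code: `ProgressMessage`, `trainStep`, and `runTraining` are hypothetical names, and a toy one-parameter objective stands in for the GPT so the example stays self-contained.

```typescript
// Hypothetical message shape sent from the training thread to the UI.
interface ProgressMessage {
  step: number;
  loss: number;
}

// Toy stand-in for one optimizer step: gradient descent on loss(w) = (w - 3)^2.
function trainStep(w: number, lr: number): { w: number; loss: number } {
  const diff = w - 3;
  const loss = diff * diff;
  const grad = 2 * diff;
  return { w: w - lr * grad, loss };
}

// Runs the loop and reports the loss every `reportEvery` steps.
// Inside a Web Worker, `post` would typically be (m) => self.postMessage(m).
function runTraining(
  steps: number,
  reportEvery: number,
  post: (m: ProgressMessage) => void
): number {
  let w = 0;
  for (let step = 1; step <= steps; step++) {
    const result = trainStep(w, 0.1);
    w = result.w;
    if (step % reportEvery === 0) post({ step, loss: result.loss });
  }
  return w;
}

// Here the "UI" is just an array that collects points for a loss curve.
const curve: ProgressMessage[] = [];
const finalW = runTraining(100, 10, (m) => curve.push(m));
```

Reporting every few steps rather than every step keeps message traffic low, so the main thread spends its time redrawing the chart instead of draining a queue.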

Similar Projects

Education · ●●● Banger

How-to-Train-Your-GPT

Build a LLaMA-style model from scratch with zero ML prerequisites or math.

Cozy · Big Brain

RaiyanYahya

101mo ago

Education · ●● Solid

How-to-train-your-GPT. Every line commented

Explains attention mechanisms to five-year-olds while building LLaMA 3 from scratch.

Cozy · Niche Gem

mateenah

401mo ago

AI/ML · ●●● Banger

MicroGPT-C – C99 GPT for Edge Training and Tiny Model Pipelines

A C99 port of Karpathy's microgpt that argues tiny coordinated models beat single large models on logic.

Wizardry · Big Brain

Ajay__soni

103mo ago

Education · ●● Solid

Interactive visualizer for Karpathy's 243-line microGPT

Type a name and you can literally watch characters turn into IDs, 16-dim embeddings get added to positional encodings, and causal attention matrices animate per head, all matched numerically to Karpathy's 243-line microGPT. The implementation is pure TypeScript (no ML libs) and includes a helpful scrollable sidebar with the reference math, which makes this an excellent, low-friction learning tool: more pedagogical deep dive than research innovation.
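
The pipeline described above (characters to IDs, token plus positional embeddings, causal attention) can be sketched in a few functions. This is a simplified illustration rather than the visualizer's code: it assumes an a-z alphabet, fills the embedding tables with deterministic placeholder values instead of trained weights, and uses a single head with no learned query/key projections. All function names are hypothetical.

```typescript
// Map each character to an integer ID using a fixed alphabet (assumed: a-z).
const alphabet = "abcdefghijklmnopqrstuvwxyz";
function encode(text: string): number[] {
  return Array.from(text, (ch) => {
    const id = alphabet.indexOf(ch);
    if (id < 0) throw new Error(`character not in alphabet: ${ch}`);
    return id;
  });
}

// Deterministic placeholder "weights"; a trained model would learn these tables.
function makeTable(rows: number, cols: number, seed: number): number[][] {
  return Array.from({ length: rows }, (_, r) =>
    Array.from({ length: cols }, (_, c) => Math.sin(seed + r * cols + c)));
}

const nEmbd = 16;     // embedding width
const blockSize = 8;  // assumed maximum context length
const tokenEmb = makeTable(alphabet.length, nEmbd, 1);
const posEmb = makeTable(blockSize, nEmbd, 2);

// Input to the first block: token embedding plus positional embedding.
function embed(ids: number[]): number[][] {
  if (ids.length > blockSize) throw new Error("input longer than blockSize");
  return ids.map((id, t) => tokenEmb[id].map((v, j) => v + posEmb[t][j]));
}

function dot(a: number[], b: number[]): number {
  return a.reduce((s, v, i) => s + v * b[i], 0);
}

// Causal attention weights: row t is a softmax over positions 0..t only.
function causalAttention(q: number[][], k: number[][]): number[][] {
  const scale = 1 / Math.sqrt(q[0].length);
  return q.map((qt, t) => {
    const scores = k.slice(0, t + 1).map((ks) => dot(qt, ks) * scale);
    const max = Math.max(...scores); // subtract the max for numerical stability
    const exps = scores.map((s) => Math.exp(s - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    // Future positions (s > t) get weight 0.
    return k.map((_, s) => (s <= t ? exps[s] / sum : 0));
  });
}

const ids = encode("emma"); // [4, 12, 12, 0]
const inputs = embed(ids);
const weights = causalAttention(inputs, inputs);
```

Each row of `weights` sums to 1 and is zero to the right of the diagonal, which is the lower-triangular pattern a per-head attention animation would display.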

Rabbit Hole · Niche Gem · Eye Candy

Sayyed23

114mo ago

AI/ML · ● Mid

PicoGPT – GPT in a QR Code

The author minified Karpathy's microGPT, ported it to 39 lines of JS (including a tiny autograd, MHA, AdamW and training loop) and shoehorned the whole gzipped HTML into a version-40 QR code that the browser decompresses and runs. It's clearly a stunt, since the model is toy-scale (≈4k params, 8-token context), but the compression trick, the use of the browser-native DecompressionStream, and the runnable-in-QR delivery are a delightful technical flex.

Wizardry · Crowd Pleaser

kuberwastaken

104mo ago

AI/ML · ●●● Banger

Andrej Karpathy's microgpt.py to C99 microgpt.c – 4,600x faster

Pure C99 GPT with SIMD that runs 4,600x faster than the Python original; drop two files into any project.

Wizardry · Zero to One

Ajay__soni

4034mo ago