How-to-train-your-GPT. Every line commented
Explains attention mechanisms to five-year-olds while building LLaMA 3 from scratch.
Build a modern LLM from scratch. Every line commented. Explained like we are five.
Build a LLaMA-style model from scratch with zero ML prerequisites or math.
Python developers, students, and engineers who want to understand Transformers
nanoGPT · The Elements of Computing Systems · Fast.ai
Karpathy's microGPT in the browser with live loss curves; pedagogical only, with no production value.
Three.js renders real GPT-2 attention patterns you can actually explore interactively.
Gamified AI education beats textbooks, but concept-driven learning exists elsewhere.
Interactive LLM explainer covering tokenization through KV cache across 15 chapters.
Beats GPT-5 on calibration via GRPO with auto-labeled news data.