A Recipe for Training Neural Networks
Overview
Andrej Karpathy's classic practitioner guide to training neural networks, covering common failure modes, debugging strategies, and the mental model needed to get neural networks working reliably.
Full Description
Written by Andrej Karpathy, this article distils years of deep learning experience into practical training wisdom. Karpathy argues that training a neural network is not just a matter of running code: it requires a methodical, scientific mindset to diagnose what is actually going wrong. The recipe covers overfitting a single batch first, visualizing the loss landscape, checking gradient flow, inspecting weight distributions, using proper learning rate schedules, and systematically ablating hyperparameters. Although published in 2019, it remains one of the most widely referenced practical guides to debugging and training neural networks in the ML community.
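The "overfit a single batch first" check can be sketched in a few lines. The example below is a minimal illustration, not code from the article: it uses a pure-Python logistic regression as a stand-in for a real network, and the batch (`BATCH_X`, `BATCH_Y`), the function names, the step count, and the learning rate are all hypothetical choices made for this sketch. The idea is that a model trained repeatedly on one small, fixed batch should drive the loss close to zero; if it cannot, something in the training loop is likely broken.

```python
import math

# Hypothetical toy batch (an assumption for this sketch):
# six 2-D points, labelled 1 when x0 + x1 > 0 and 0 otherwise.
BATCH_X = [(1.0, 2.0), (2.0, 1.0), (-1.0, -2.0), (-2.0, -1.0), (1.5, 0.5), (-0.5, -1.5)]
BATCH_Y = [1, 1, 0, 0, 1, 0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Predicted probability of class 1 for a single example."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def batch_loss(weights, bias, xs, ys, eps=1e-12):
    """Mean binary cross-entropy over the batch."""
    total = 0.0
    for x, y in zip(xs, ys):
        # Clamp the probability so log() never sees exactly 0 or 1.
        p = min(max(predict(weights, bias, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(xs)


def overfit_single_batch(xs, ys, steps=500, lr=0.5):
    """Run gradient descent on one fixed batch and record the loss after every step."""
    weights, bias = [0.0] * len(xs[0]), 0.0
    history = [batch_loss(weights, bias, xs, ys)]  # loss at initialization
    for _ in range(steps):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y  # dLoss/dlogit for sigmoid + cross-entropy
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / len(xs)
            grad_b += err / len(xs)
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
        history.append(batch_loss(weights, bias, xs, ys))
    return weights, bias, history


weights, bias, history = overfit_single_batch(BATCH_X, BATCH_Y)
accuracy = sum(
    (predict(weights, bias, x) > 0.5) == bool(y) for x, y in zip(BATCH_X, BATCH_Y)
) / len(BATCH_Y)
print(f"loss at init: {history[0]:.4f}, final loss: {history[-1]:.4f}, batch accuracy: {accuracy:.2f}")
```

Two things are worth reading off the output. With zero-initialized parameters every prediction is 0.5, so the loss at initialization should equal ln 2 (about 0.693) for a balanced binary problem; a different starting value would point to a bug in the loss or the initialization. The final loss should then be close to zero with every example in the batch classified correctly. With a real network the same pattern applies: fix one batch, disable regularization, and train until the loss on that batch collapses.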