Andrej Karpathy
Andrej Karpathy (born 23 October 1986) is a Slovak-Canadian-American artificial intelligence researcher, educator, and entrepreneur. He is widely recognised for his contributions to computer vision and deep learning, his influential role as Senior Director of AI at Tesla where he led the Autopilot vision team, and his prolific open-source and educational work that has made neural network research accessible to millions. He is the founder of Eureka Labs, an AI-native education company.
Early life and education
Karpathy was born in Bratislava, then part of Czechoslovakia, and moved to Toronto, Canada, at age 15.[1] He received his Bachelor of Science in Computer Science and Physics from the University of Toronto in 2009, where he was exposed to the machine learning research culture surrounding Geoffrey Hinton's group.
He completed a Master of Science at the University of British Columbia in 2011, working on physics-based character animation using reinforcement learning. He then moved to Stanford University, where he earned his PhD in 2015 under the supervision of Fei-Fei Li. His doctoral thesis, Connecting Images and Natural Language, explored models that generate natural language descriptions of images — work that helped establish the field of vision-language modelling.[2]
Career
Stanford and cs231n
While at Stanford, Karpathy created and taught cs231n: Convolutional Neural Networks for Visual Recognition, which became one of the most popular computer science courses in the university's history, with over 700 students enrolled per offering by 2017.[3] The course's freely available lecture videos on YouTube have been viewed millions of times and are widely credited with training a generation of deep learning practitioners. The accompanying course notes became a de facto textbook for learning convolutional neural networks.
OpenAI (2015–2017)
Karpathy was a founding member of OpenAI, which launched in December 2015, and worked there as a research scientist. During this period he focused on generative models and deep reinforcement learning. His work at OpenAI included PixelCNN++, a generative model of images, and World of Bits, a benchmark for reinforcement learning agents that operate in web browsers.
Tesla (2017–2022)
In June 2017, Karpathy joined Tesla as Senior Director of AI, leading the Autopilot computer vision team. At Tesla, he oversaw the transition from a multi-sensor fusion approach to a pure vision-based system for autonomous driving, arguing that cameras — like human eyes — provide sufficient information for navigation when processed by sufficiently powerful neural networks.
Under Karpathy's leadership, Tesla's Autopilot team:
- Built one of the largest real-world neural network training pipelines, processing petabytes of driving video data from Tesla's fleet
- Developed the "HydraNet" architecture — a multi-task neural network that shared a backbone across dozens of driving-related perception tasks (object detection, lane detection, depth estimation, traffic light recognition)
- Transitioned from hand-labelled datasets to an auto-labelling pipeline that used offline models and multi-camera reconstruction to automatically generate training labels at scale
- Presented its work at Tesla's first "AI Day" (2021), a public technical presentation that offered unusual transparency into a production AI system's architecture
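The shared-backbone idea behind a multi-task network of this kind can be illustrated with a toy sketch. This is not Tesla's code: the class name `MultiTaskNet`, the head names, the layer sizes, and the use of small dense layers in place of a convolutional backbone are all assumptions made for illustration. The point it shows is that the input is processed once by the shared backbone, and each task head then reads the same feature vector.

```python
import random


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    return [max(0.0, v) for v in values]


def init_layer(rng, n_in, n_out):
    """Random weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Hypothetical toy model: one shared backbone, one small head per task."""

    def __init__(self, n_in, n_hidden, head_sizes, seed=0):
        rng = random.Random(seed)
        self.backbone = init_layer(rng, n_in, n_hidden)
        self.heads = {name: init_layer(rng, n_hidden, n_out)
                      for name, n_out in head_sizes.items()}
        self.backbone_calls = 0  # counts how often the shared part runs

    def features(self, x):
        self.backbone_calls += 1
        return relu(linear(*self.backbone, x))

    def forward(self, x):
        shared = self.features(x)  # computed once, reused by every head
        return {name: linear(w, b, shared)
                for name, (w, b) in self.heads.items()}


net = MultiTaskNet(n_in=8, n_hidden=16,
                   head_sizes={"objects": 4, "lanes": 2, "traffic_lights": 3})
outputs = net.forward([0.1 * i for i in range(8)])
print({name: len(values) for name, values in outputs.items()})
```

Adding a further task in this sketch means adding one more entry to `head_sizes`; the backbone computation is not repeated, which is the usual motivation given for sharing it across tasks.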
Karpathy departed Tesla in July 2022, citing a desire to return to hands-on technical work.[4]
Return to OpenAI (2023–2024)
In February 2023, Karpathy briefly returned to OpenAI, where he contributed to research and education initiatives. He left again in February 2024, stating his intention to focus on personal projects.[5]
Eureka Labs (2024–present)
In July 2024, Karpathy announced the founding of Eureka Labs, an AI-native education company. The venture aims to create a new kind of educational experience in which an AI teaching assistant, guided by course materials designed by expert human instructors, provides personalised tutoring at scale. The first planned course is LLM101n: Let's build a Storyteller, an undergraduate-level course on building a large language model from scratch.[6]
Open-source and educational contributions
Karpathy is one of the most influential AI educators working outside traditional academia. His major open-source projects include:
- char-rnn (2015): A character-level recurrent neural network for text generation, accompanied by the blog post "The Unreasonable Effectiveness of Recurrent Neural Networks", which became one of the most widely read introductions to RNNs and inspired thousands of hobbyist projects.[7]
- minGPT (2020): A minimal PyTorch re-implementation of GPT whose model definition runs to about 300 lines, designed to strip away engineering complexity and expose the core algorithm. The repository became a standard pedagogical reference for understanding transformers.
- nanoGPT (2023): A successor to minGPT optimised for training speed while retaining simplicity. Its README reports reproducing GPT-2 (124M) on OpenWebText in about four days on a single node with eight A100 40GB GPUs. nanoGPT's codebase became the starting point for dozens of research projects and educational tutorials.
- llm.c (2024): An implementation of GPT-2 training in plain C and CUDA, with no dependency on PyTorch or any other deep learning framework. Its CPU reference implementation fits in roughly 1,000 lines of C, and the project provoked discussion about the complexity overhead of modern ML frameworks.[8]
- build-nanogpt (2024): A repository accompanying the video lecture "Let's reproduce GPT-2 (124M)", which walks through the construction of a GPT from scratch and received over 3 million views in its first months.
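The character-level generation loop that projects such as char-rnn are built around can be sketched in a few lines. The example below is not taken from any of the repositories above: it swaps the recurrent network for a simple bigram count model so that it runs with the standard library alone, and the function names `train_bigram` and `sample` and the tiny corpus are hypothetical. What carries over is the overall shape: learn a distribution over the next character from text, then sample one character at a time and feed it back in.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def sample(counts, start, length, seed=0):
    """Generate up to `length` characters, one at a time, from the counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:  # character was never seen with a successor
            break
        chars = list(followers)
        weights = [followers[c] for c in chars]
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


corpus = "hello world, hello there, hello again"
model = train_bigram(corpus)
generated = sample(model, start="h", length=20)
print(generated)
```

A recurrent or transformer model replaces the count table with a learned function of the whole preceding context, but the sampling loop keeps the same structure.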
His YouTube channel, which began publishing the "Neural Networks: Zero to Hero" lecture series in 2022, has accumulated over 1 million subscribers and is widely regarded as one of the best free resources for learning about LLMs, tokenisation, and neural network internals.
Influence and recognition
Karpathy's educational approach — building systems from scratch in minimal code, explaining every line — has been widely imitated and has materially shaped how a generation of engineers learns deep learning. His phrase "the hottest new programming language is English" (referring to prompt engineering) gained wide currency in 2023.
He has been cited as one of the most influential voices in AI by Time, MIT Technology Review, and Forbes. His research papers have been cited over 100,000 times according to Google Scholar.
Selected publications
- Karpathy, A.; Toderici, G.; Shetty, S.; Leung, T.; Sukthankar, R.; Fei-Fei, L. (2014). "Large-Scale Video Classification with Convolutional Neural Networks." CVPR 2014.
- Karpathy, A.; Fei-Fei, L. (2015). "Deep Visual-Semantic Alignments for Generating Image Descriptions." CVPR 2015.
- Johnson, J.; Karpathy, A.; Fei-Fei, L. (2016). "DenseCap: Fully Convolutional Localization Networks for Dense Captioning." CVPR 2016.
References
1. Karpathy, Andrej. Personal blog, "About" page.
2. Karpathy, Andrej; Fei-Fei, Li (2015). "Deep Visual-Semantic Alignments for Generating Image Descriptions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
3. Stanford University. cs231n course page.
4. Karpathy, Andrej (13 July 2022). Announcement on Twitter/X.
5. Karpathy, Andrej (13 February 2024). Announcement on X.
6. Karpathy, Andrej (16 July 2024). "Eureka Labs." Blog post.
7. Karpathy, Andrej (21 May 2015). "The Unreasonable Effectiveness of Recurrent Neural Networks." Blog post.
8. Karpathy, Andrej (2024). "llm.c: LLM training in simple, raw C/CUDA." GitHub repository.