Complete Educational Implementations of Ilya Sutskever's 30 Foundational Deep Learning Papers
By auraham
Summary
This repository provides educational implementations of all 30 foundational deep learning papers on Ilya Sutskever's reading list, the collection he said would teach "90% of what matters" in deep learning. Each paper has a detailed notebook, and together they cover foundational concepts from early neural networks to modern architectures. Additional versions are available for agents and for polyglot/multi-backend implementations.
Key quotes
This repository contains detailed, educational implementations of the papers from Ilya Sutskever's famous reading list - the collection he told John Carmack would teach you '90% of what matters' in deep learning.
Progress: 30/30 papers (100%) - COMPLETE! 🎉
Each implementation:
Sutskever 30 - Complete Implementation Suite
Comprehensive toy implementations of the 30 foundational papers recommended by Ilya Sutskever