Complete Educational Implementations of Ilya Sutskever's 30 Foundational Deep Learning Papers

auraham

4mo ago · 10 min read · en · Code

Score: 100 · Type: how-to · Sentiment: positive

Summary

This repository provides educational implementations of all 30 foundational deep learning papers on Ilya Sutskever's recommended reading list, the collection he said would teach "90% of what matters" in deep learning. Each paper has a detailed notebook, and together they cover concepts from early neural networks to modern architectures. Separate versions are available for agents and as a polyglot/multi-backend implementation.

Key quotes


This repository contains detailed, educational implementations of the papers from Ilya Sutskever's famous reading list - the collection he told John Carmack would teach you '90% of what matters' in deep learning.

Progress: 30/30 papers (100%) - COMPLETE! 🎉

Each implementation:

Sutskever 30 - Complete Implementation Suite

Comprehensive toy implementations of the 30 foundational papers recommended by Ilya Sutskever

Snippet from the RSS feed

Sutskever 30 implementations inspired by https://papercode.vercel.app/ | For Agents, use https://github.com/pageman/Sutskever-Agent | Polyglot / Multi-Backend version at https://github.com/pageman/s...
