Analyzing Training Example Order Effects in Neural Network Gradient Descent
By pb1729
Summary
This article explores how the order of training examples affects neural network training by gradient descent, in contrast to the Bayesian view, in which the training set is unordered and updates from individual examples commute. It explains how to compute the effect of swapping the order of two training examples on a per-parameter level using Lie brackets, which measure how far the gradient updates from different examples are from commuting.
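As a rough illustration of the idea in the summary, the sketch below compares two plain SGD steps taken in both orders against a second-order estimate of the difference. Everything in it is an assumption made for illustration and not taken from the article: the one-layer tanh model, the squared-error loss, the function names, and the sign convention. Under those assumptions, expanding the second step to first order gives (a then b) minus (b then a) ≈ lr² · (H_b g_a − H_a g_b), where g and H are the per-example gradient and Hessian at the starting parameters. That expression is the Lie bracket of the two gradient fields, up to sign convention.

```python
import numpy as np

# Hypothetical toy model: prediction = tanh(x . theta), loss = 0.5 * (pred - y)^2.

def grad(theta, x, y):
    """Gradient of the per-example loss with respect to theta."""
    p = np.tanh(x @ theta)
    return (p - y) * (1.0 - p**2) * x

def hessian(theta, x, y):
    """Hessian of the per-example loss with respect to theta."""
    p = np.tanh(x @ theta)
    s = 1.0 - p**2  # derivative of tanh at x . theta
    return (s**2 - 2.0 * p * (p - y) * s) * np.outer(x, x)

def sgd_step(theta, x, y, lr):
    """One plain gradient descent step on a single example."""
    return theta - lr * grad(theta, x, y)

def swap_effect_exact(theta, ex_a, ex_b, lr):
    """Parameters after (a then b) minus parameters after (b then a)."""
    a_then_b = sgd_step(sgd_step(theta, *ex_a, lr), *ex_b, lr)
    b_then_a = sgd_step(sgd_step(theta, *ex_b, lr), *ex_a, lr)
    return a_then_b - b_then_a

def swap_effect_bracket(theta, ex_a, ex_b, lr):
    """Second-order estimate of the same difference: lr^2 * (H_b g_a - H_a g_b)."""
    g_a, g_b = grad(theta, *ex_a), grad(theta, *ex_b)
    H_a, H_b = hessian(theta, *ex_a), hessian(theta, *ex_b)
    return lr**2 * (H_b @ g_a - H_a @ g_b)

# Made-up parameters and examples; each example is an (input, target) pair.
theta0 = np.array([0.3, -0.2, 0.5])
example_a = (np.array([1.0, 2.0, -1.0]), 0.5)
example_b = (np.array([-0.5, 1.0, 0.5]), -0.3)
lr = 1e-3

exact = swap_effect_exact(theta0, example_a, example_b, lr)
approx = swap_effect_bracket(theta0, example_a, example_b, lr)

# One entry per parameter: how much swapping the two examples moves it.
print("exact difference:", exact)
print("bracket estimate:", approx)
```

The two vectors should agree up to terms of order lr³, so the gap between them shrinks faster than the effect itself as the learning rate decreases. If the two inputs were orthogonal, both Hessian-gradient products would vanish in this toy model and the two updates would commute to this order.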
Key quotes
An ideal machine learning model would not care what order training examples appeared in its training process.
From a Bayesian perspective, the training dataset is unordered data and all updates based on seeing one additional example should commute with each other.
For neural nets trained by gradient descent, however, this is not the case.
This webpage will explain how to compute the effects of swapping the order of two training examples on a per-parameter level.