Analyzing Training Example Order Effects in Neural Network Gradient Descent

pb1729

1mo ago · 8 min read · en · Insight

Score: 100 · Type: analysis · Sentiment: neutral

Summary

This article examines how the order of training examples affects neural networks trained by gradient descent, in contrast to the Bayesian view, in which the training set is unordered and updates from individual examples commute. It explains how to compute, on a per-parameter level, the effect of swapping the order of two training examples using Lie brackets, which measure the non-commutativity of the gradient updates from different examples.

Key quotes (4 pulled)

An ideal machine learning model would not care what order training examples appeared in its training process.

From a Bayesian perspective, the training dataset is unordered data and all updates based on seeing one additional example should commute with each other.

For neural nets trained by gradient descent, however, this is not the case.

This webpage will explain how to compute the effects of swapping the order of two training examples on a per-parameter level.
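
As a rough illustration of the idea in these quotes, here is a minimal sketch, not the article's own code. It assumes a hypothetical two-parameter model `w[1] * tanh(w[0] * x)` with squared-error loss, plain SGD with learning rate `lr`, and the sign convention [v_a, v_b] = (Dv_b) v_a - (Dv_a) v_b for the vector fields v = -grad(loss). Under those assumptions, swapping two examples changes the result by about lr**2 times the bracket, which equals lr**2 * (H_b g_a - H_a g_b). All function names are made up for this example.

```python
import numpy as np


def grad(w, example):
    """Gradient of 0.5 * (w[1] * tanh(w[0] * x) - y)**2 with respect to w."""
    x, y = example
    h = np.tanh(w[0] * x)
    r = w[1] * h - y  # residual of the toy model on this example
    return np.array([r * w[1] * (1.0 - h**2) * x, r * h])


def sgd_step(w, example, lr):
    """One plain SGD update on a single example."""
    return w - lr * grad(w, example)


def hvp(w, example, v, eps=1e-5):
    """Hessian-vector product H(w) @ v, approximated by a central
    difference of the gradient along direction v."""
    return (grad(w + eps * v, example) - grad(w - eps * v, example)) / (2 * eps)


def swap_effect(w, a, b, lr):
    """Exact per-parameter difference: (a then b) minus (b then a)."""
    w_ab = sgd_step(sgd_step(w, a, lr), b, lr)
    w_ba = sgd_step(sgd_step(w, b, lr), a, lr)
    return w_ab - w_ba


def bracket_prediction(w, a, b, lr):
    """Second-order prediction of swap_effect: lr**2 * (H_b g_a - H_a g_b).
    Assumed sign convention; other texts may define the bracket with
    the opposite sign."""
    g_a, g_b = grad(w, a), grad(w, b)
    return lr**2 * (hvp(w, b, g_a) - hvp(w, a, g_b))


# Hypothetical parameters and (x, y) examples, chosen only for illustration.
w0 = np.array([0.5, -1.0])
example_a, example_b = (1.0, 2.0), (-0.7, 0.3)
print("exact swap effect:  ", swap_effect(w0, example_a, example_b, 1e-3))
print("bracket prediction: ", bracket_prediction(w0, example_a, example_b, 1e-3))
```

The two printed vectors should agree up to terms of order lr**3, so the agreement tightens as `lr` shrinks. For a real network one would likely replace the finite-difference `hvp` with automatic differentiation.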
