
Analyzing Training Example Order Effects in Neural Network Gradient Descent

By pb1729

1mo ago · 8 min read · en · Insight

Summary

This article examines how the order of training examples affects a neural network trained by gradient descent. This contrasts with the Bayesian view, in which the training set is unordered and the updates from individual examples commute. The article shows how to compute the per-parameter effect of swapping two training examples using Lie brackets, which measure the non-commutativity of the gradient updates from different examples.
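
As a rough sketch of why a Lie bracket appears here (the notation is assumed for illustration and is not taken from the article): write θ for the parameters, η for the learning rate, and g_a = ∇L_a and H_a = ∇²L_a for the gradient and Hessian of the loss on example a, all evaluated at θ unless stated otherwise. Expanding two successive gradient steps to second order in η, and using one common sign convention for the bracket, gives:

```latex
\begin{aligned}
\theta_{ab} &= \theta - \eta\, g_a(\theta) - \eta\, g_b\big(\theta - \eta\, g_a(\theta)\big) \\
            &= \theta - \eta\, g_a - \eta\, g_b + \eta^2 H_b\, g_a + O(\eta^3), \\
\theta_{ab} - \theta_{ba} &= \eta^2 \left( H_b\, g_a - H_a\, g_b \right) + O(\eta^3)
                           = \eta^2\, [g_a, g_b] + O(\eta^3).
\end{aligned}
```

The first-order terms cancel, so the dependence on order is an η² effect, and each component of [g_a, g_b] is the leading-order shift in one parameter.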

Key quotes

An ideal machine learning model would not care what order training examples appeared in its training process.
From a Bayesian perspective, the training dataset is unordered data and all updates based on seeing one additional example should commute with each other.
For neural nets trained by gradient descent, however, this is not the case.
This webpage will explain how to compute the effects of swapping the order of two training examples on a per-parameter level.
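
The swap effect described in these quotes can be checked numerically on a toy problem. The sketch below is not the article's code: the two-parameter model `f(x) = a * b * x`, the squared-error loss, and all function names are assumptions chosen so that gradients and Hessians can be written by hand. It compares the actual parameter gap between the two orderings with the bracket `H2 g1 - H1 g2`, scaled by the squared learning rate.

```python
# Toy check (hypothetical model, not from the article): f(x) = a * b * x,
# per-example loss 0.5 * (a*b*x - y)**2, parameters theta = (a, b).

def grad(theta, example):
    """Gradient of the per-example loss with respect to (a, b)."""
    a, b = theta
    x, y = example
    r = a * b * x - y  # residual
    return (r * b * x, r * a * x)

def hess_vec(theta, example, v):
    """Hessian of the per-example loss at theta, applied to the vector v."""
    a, b = theta
    x, y = example
    r = a * b * x - y
    h_aa = (b * x) ** 2
    h_bb = (a * x) ** 2
    h_ab = a * b * x * x + r * x
    return (h_aa * v[0] + h_ab * v[1], h_ab * v[0] + h_bb * v[1])

def sgd_step(theta, example, lr):
    """One plain gradient-descent step on a single example."""
    g = grad(theta, example)
    return (theta[0] - lr * g[0], theta[1] - lr * g[1])

def order_gap(theta, ex1, ex2, lr):
    """Parameters after (ex1 then ex2) minus parameters after (ex2 then ex1)."""
    t12 = sgd_step(sgd_step(theta, ex1, lr), ex2, lr)
    t21 = sgd_step(sgd_step(theta, ex2, lr), ex1, lr)
    return (t12[0] - t21[0], t12[1] - t21[1])

def lie_bracket(theta, ex1, ex2):
    """Bracket of the two gradient fields, using the convention H2 g1 - H1 g2."""
    g1 = grad(theta, ex1)
    g2 = grad(theta, ex2)
    h2g1 = hess_vec(theta, ex2, g1)
    h1g2 = hess_vec(theta, ex1, g2)
    return (h2g1[0] - h1g2[0], h2g1[1] - h1g2[1])

if __name__ == "__main__":
    theta = (0.7, -1.3)
    ex1, ex2 = (1.0, 2.0), (-0.5, 0.3)
    lr = 1e-4
    gap = order_gap(theta, ex1, ex2, lr)
    bracket = lie_bracket(theta, ex1, ex2)
    # To leading order, gap / lr**2 should match the bracket per parameter.
    print("gap / lr^2 :", (gap[0] / lr**2, gap[1] / lr**2))
    print("bracket    :", bracket)
```

For a real network the hand-written Hessian would be replaced by Hessian-vector products from an automatic differentiation library; the comparison itself stays the same.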