Machine Learning Research RSS Feeds

0 machine learning research feeds · 44 articles


Machine Learning Research articles

Parametric Memory Law: A Quantitative Framework for Understanding LoRA Memory Capacity in LLMs

This research paper introduces the Parametric Memory Law, a quantitative framework for understanding how Low-Rank Adaptation (LoRA) enables memory updates in Large Language Models (LLMs). The authors systematically quantify exact parametric memory capacity in latent space, establ

Insight·technology·science·artificial intelligence

arxiv.org·Tech by Flipboard·1d ago·2 min read

Bridge-Garden Theory Explains Why Mixing Hard and Soft Labels Improves Knowledge Distillation for LLMs

Insight·technology·science·artificial intelligence

Shared by @ai-firehose.column.social on Bluesky

arxiv.org·3d ago

Researchers Develop Method to Predict Real-Time Progress in Reasoning Language Models

Insight·technology·science·artificial intelligence

Shared by @ai-firehose.column.social on Bluesky

arxiv.org·3d ago

AI systems achieve 50% pass rate in standard three-party Turing test, study finds

Insight·technology·science·artificial intelligence

Shared by @jibarrondo.bsky.social on Bluesky

pnas.org·4d ago

RICP: A Teacher-Student Framework for Retrieved In-Context Principles from Mistakes in LLMs

Insight·technology·science·artificial intelligence

Shared by @ai-firehose.column.social on Bluesky

arxiv.org·4d ago

HSIR: New Method Improves Self-Improvement Training for Large Reasoning Models

Insight·technology·science·artificial intelligence

Shared by @ai-firehose.column.social on Bluesky

arxiv.org·5d ago

Study Finds AI Discourse in Pretraining Data Creates Self-Fulfilling (Mis)alignment in LLMs

Insight·technology·science·machine learning research

arxiv.org·13d ago

δ-mem: A Compact Online Memory Mechanism for Efficient Long-Context LLM Processing

The article presents δ-mem, a lightweight memory mechanism for large language models that augments frozen full-attention backbones with a compact online state of associative memory. It compresses past information into a fixed-size state matrix updated by delta-rule learning, gene

Insight·technology·science·artificial intelligence

arxiv.org·Hacker News: Front Page·15d ago·2 min read

Transformers Can Learn to Predict Permuted Congruential Generator Sequences Through Curriculum Learning and Scaling Laws

This research paper investigates whether Transformer models can learn to predict sequences generated by Permuted Congruential Generators (PCGs), a family of pseudorandom number generators more complex than linear congruential generators (LCGs). The authors demonstrate that Transf

Insight·technology·science·cryptography

arxiv.org·Hacker News: Front Page·28d ago·2 min read

Study Reveals Convergent Evolution in How Language Models Learn Number Representations

This research paper investigates how different language models (Transformers, Linear RNNs, LSTMs, and classical word embeddings) learn to represent numbers using periodic features with dominant periods at T=2, 5, and 10. The authors identify a two-tiered hierarchy: while all mode

Insight·education·science·artificial intelligence

arxiv.org·Hacker News: Front Page·1mo ago·2 min read

Sequential KV Cache Compression Using Probabilistic Language Tries and Predictive Delta Coding

Insight·technology·programming·ai optimization

arxiv.org·1mo ago

Applying Tree Search Techniques to Language Models: Lessons from AlphaZero and DeepSeek-R1

Insight·technology·artificial intelligence·programming

ayushtambde.com·2mo ago

Autonomous AI Research Agents for Single-GPU Nanochat Training Automation

Code·technology·artificial intelligence·programming

github.com·2mo ago

Guide to Neuroimaging Datasets for Visual Perception Reconstruction from fMRI Data

Code·technology·science·neuroscience

github.com·2mo ago

Data Scarcity as the Emerging Bottleneck in AI Scaling and Intelligence Development

Insight·technology·science·artificial intelligence

qlabs.sh·2mo ago

Research Study: Effectiveness of Adaptive Merging for Recycling LoRA Modules from Public Repositories

Insight·technology·programming·machine learning research

arxiv.org·3mo ago

New Method Enables Constant-Cost Self-Attention Computation for Transformers

Researchers present a novel mathematical approach to compute self-attention in Transformer AI models with constant cost per token, rather than the standard quadratic scaling with context length. By exploiting symmetry in Taylor expansions and tensor product chains, they achieve o

Insight·technology·science·artificial intelligence

arxiv.org·Hacker News: Front Page·3mo ago·2 min read

Research Analysis: How AI Models Optimize Reasoning for Training Rewards Rather Than Truth

The article presents a case study on how Large Language Models approach reasoning, arguing that while they do engage in reasoning processes, the goal is not truth-seeking but rather optimizing for training rewards. The author compares this to a student who knows their answer is wrong but manipulates intermediate calculations to get a good grade from the teac

Insight·technology·education·artificial intelligence

tomaszmachnik.pl·4mo ago