δ-mem: A Compact Online Memory Mechanism for Efficient Long-Context LLM Processing
By 44za12
Summary
The article presents δ-mem, a lightweight memory mechanism for large language models that augments a frozen full-attention backbone with a compact online associative memory state. It compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses the state's readout to generate low-rank corrections to the backbone's attention computation during generation. With only an 8×8 memory state, δ-mem raises the average score to 1.10× that of the frozen backbone and 1.15× that of the strongest non-δ-mem memory baseline. Gains are larger on memory-heavy benchmarks (1.31× on MemoryAgentBench, 1.20× on LoCoMo), and general capabilities are largely preserved, all without full fine-tuning, backbone replacement, or explicit context extension.
Key quotes
We propose $\delta$-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory.
$\delta$-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation.
With only an $8\times8$ online memory state, $\delta$-mem improves the average score to $1.10\times$ that of the frozen backbone and $1.15\times$ that of the strongest non-$\delta$-mem memory baseline.
It achieves larger gains on memory-heavy benchmarks, reaching $1.31\times$ on MemoryAgentBench and $1.20\times$ on LoCoMo, while largely preserving general capabilities.
These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
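The delta-rule update named in the quotes above can be illustrated with a minimal sketch. It uses the classical delta rule, S ← S + β (v − S k) kᵀ, on an 8×8 state. The paper's exact parameterization is not given here, so the function names (`read`, `delta_update`), the step size `beta`, and the one-hot keys are assumptions for illustration only. The sketch also does not model how keys and values are derived from the backbone or how the readout becomes a low-rank attention correction.

```python
# Illustrative sketch of a classical delta-rule associative memory.
# Assumption: this is NOT the paper's implementation; names and values are hypothetical.

D = 8  # the state is D x D, matching the 8x8 size quoted above


def zeros_state(d=D):
    """Return a d x d state matrix filled with zeros."""
    return [[0.0] * d for _ in range(d)]


def one_hot(i, d=D):
    """Unit-norm key with a single non-zero entry (keeps the arithmetic exact)."""
    return [1.0 if j == i else 0.0 for j in range(d)]


def read(state, key):
    """Readout: the matrix-vector product state @ key."""
    return [sum(row[j] * key[j] for j in range(len(key))) for row in state]


def delta_update(state, key, value, beta=1.0):
    """Delta rule: S <- S + beta * (value - S @ key) key^T. Returns a new state."""
    predicted = read(state, key)
    error = [v - p for v, p in zip(value, predicted)]
    return [
        [state[i][j] + beta * error[i] * key[j] for j in range(len(key))]
        for i in range(len(value))
    ]


# Write two associations under orthogonal, unit-norm keys.
k1, k2 = one_hot(0), one_hot(1)
v1 = [1.0] * D
v2 = [float(i) for i in range(D)]

state = zeros_state()
state = delta_update(state, k1, v1)
state = delta_update(state, k2, v2)

# Overwrite the value stored under k1. The state stays 8x8 however many
# writes occur, and the association under the orthogonal key k2 is untouched.
v1_new = [2.0] * D
state = delta_update(state, k1, v1_new)

print(read(state, k1))
print(read(state, k2))
```

With unit-norm keys and `beta=1.0`, each write replaces whatever the state previously returned for that key, which is what lets a fixed-size matrix be updated online instead of growing with the context. Non-orthogonal keys would interfere with each other, and a smaller `beta` would blend old and new values rather than overwrite.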
