δ-mem: A Compact Online Memory Mechanism for Efficient Long-Context LLM Processing

44za12

15d ago · 2 min read · en · Insight

Score: 70 · Type: analysis · Sentiment: positive

Summary

The article presents δ-mem, a lightweight memory mechanism for large language models that augments a frozen full-attention backbone with a compact online associative-memory state. It compresses past information into a fixed-size state matrix updated by delta-rule learning, and the readout of that state generates low-rank corrections to the backbone's attention computation during generation. With only an 8×8 memory state, δ-mem raises the average score to 1.10× that of the frozen backbone and 1.15× that of the strongest non-δ-mem memory baseline, with larger gains on memory-heavy benchmarks (1.31× on MemoryAgentBench, 1.20× on LoCoMo), while largely preserving general capabilities and without full fine-tuning, backbone replacement, or explicit context extension.
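To make "a fixed-size state matrix updated by delta-rule learning" concrete, here is a minimal sketch of the textbook delta rule for an associative memory, S ← S + β (v − S k) kᵀ. It is an illustration under stated assumptions, not the paper's implementation: the summary does not give δ-mem's exact update, how keys and values are produced, or how the readout enters attention, so `delta_rule_update`, `read`, and `beta` are hypothetical names, and the low-rank attention correction is deliberately left out.

```python
import numpy as np

def delta_rule_update(S, k, v, beta=0.5):
    """One delta-rule write: move the readout S @ k a fraction beta toward v.

    Assumed textbook form, not taken from the paper.
    """
    k = k / (np.linalg.norm(k) + 1e-8)  # unit-norm key keeps the step size stable
    error = v - S @ k                   # what the memory currently gets wrong for this key
    return S + beta * np.outer(error, k)

def read(S, q):
    """Read from the memory with a query vector."""
    return S @ q

rng = np.random.default_rng(0)
d = 8                                   # matches the 8x8 state size quoted in the summary
S = np.zeros((d, d))                    # the state never grows, however many writes occur
k = rng.standard_normal(d)
v = rng.standard_normal(d)

# Repeated writes of the same (key, value) pair shrink the recall error
# by a factor of (1 - beta) per step.
for _ in range(20):
    S = delta_rule_update(S, k, v, beta=0.5)

k_unit = k / np.linalg.norm(k)
recalled = read(S, k_unit)              # close to v after the writes above
```

The point of the sketch is only that the memory cost is constant: each new observation overwrites part of the same 8×8 matrix rather than appending to a growing context.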

Key quotes

We propose $\delta$-mem, a lightweight memory mechanism that augments a frozen full-attention backbone with a compact online state of associative memory.

$\delta$-mem compresses past information into a fixed-size state matrix updated by delta-rule learning, and uses its readout to generate low-rank corrections to the backbone's attention computation during generation.

With only an $8\times8$ online memory state, $\delta$-mem improves the average score to $1.10\times$ that of the frozen backbone and $1.15\times$ that of the strongest non-$\delta$-mem memory baseline.

It achieves larger gains on memory-heavy benchmarks, reaching $1.31\times$ on MemoryAgentBench and $1.20\times$ on LoCoMo, while largely preserving general capabilities.

These results show that effective memory can be realized through a compact online state directly coupled with attention computation, without full fine-tuning, backbone replacement, or explicit context extension.
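As a rough consistency check on the two average-score ratios quoted above, both are stated relative to the same δ-mem average, so dividing one by the other gives the implied position of the strongest non-δ-mem baseline relative to the frozen backbone. The symbols $A_{\delta}$, $A_{\text{frozen}}$, and $A_{\text{base}}$ are illustrative and do not appear in the source.

```latex
A_{\delta} = 1.10\,A_{\text{frozen}} = 1.15\,A_{\text{base}}
\quad\Longrightarrow\quad
\frac{A_{\text{base}}}{A_{\text{frozen}}} = \frac{1.10}{1.15} \approx 0.957
```

The quoted ratios are rounded, so this is only approximate, but it suggests that the strongest non-δ-mem memory baseline averages slightly below the frozen backbone.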

Snippet from the RSS feed

Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and often fails to ensure effective context utilization. We propose $δ$-mem, a
