JAMEL: A Framework for Joint Memory and Exploration Learning in Language Model Agents

[Submitted on 1 Jun 2026]

Summary

This paper introduces JAMEL (Joint Agent Memory and Exploration Learning), a framework that trains language model agents to explore open-ended environments more effectively by jointly learning memory compression and an exploration policy. The key insight is that memory and exploration form a mutually dependent loop: sustained exploration needs memory to distinguish exhausted behaviors from unseen ones, while novelty-seeking interaction provides the supervision needed to train useful memory. The framework uses deterministic, persistent novelty signals (such as code coverage in the GUI domain) as natural, annotation-free supervision for the memory module. Empirical results show that JAMEL generalizes to unseen environments, outperforms open-weight baselines in exploration, and rivals the exploration depth of a closed-source model while reducing token consumption.
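To make the coverage-as-novelty idea concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the `CoverageNovelty` class, the `explore` function, the coverage unit strings, and the rule "remember only actions that produced new coverage" are all assumptions made for illustration. It shows only how a deterministic coverage signal can score each step without any annotation, and how that score can decide what a compact memory keeps.

```python
from dataclasses import dataclass, field


@dataclass
class CoverageNovelty:
    """Hypothetical helper: tracks coverage seen so far and scores new coverage."""
    seen: set = field(default_factory=set)

    def reward(self, covered: set) -> int:
        # Novelty is the number of coverage units not seen in any earlier step.
        new_units = covered - self.seen
        self.seen |= new_units
        return len(new_units)


def explore(steps, novelty):
    """Replay a fixed list of (action, covered_units) steps.

    Returns the per-step novelty rewards and a compact memory that keeps
    only the actions that uncovered something new (an assumed rule, chosen
    here to stand in for a learned memory module).
    """
    rewards, memory = [], []
    for action, covered in steps:
        r = novelty.reward(set(covered))
        rewards.append(r)
        if r > 0:
            memory.append(action)
    return rewards, memory


# Made-up trajectory: the second step repeats the first and earns no novelty.
steps = [
    ("open_menu", {"menu.py:10", "menu.py:11"}),
    ("open_menu", {"menu.py:10", "menu.py:11"}),
    ("click_save", {"menu.py:10", "io.py:42"}),
]
rewards, memory = explore(steps, CoverageNovelty())
print(rewards)  # [2, 0, 1]
print(memory)   # ['open_menu', 'click_save']
```

In this toy run the repeated action scores zero, so it is dropped from memory. That is the loop the summary describes: the novelty signal tells the memory what is worth keeping, and the memory in turn is what would let a policy avoid exhausted behaviors.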

Source

JAMEL: A Framework for Joint Memory and Exploration Learning in Language Model Agents (arxiv.org)

Key quotes

We introduce Joint Agent Memory and Exploration Learning (JAMEL), a framework that trains agentic memory and exploration policy together through novelty-driven interaction.

We observe that memory and exploration form a mutually dependent loop: sustained exploration requires memory to distinguish exhausted behaviors from unseen ones, while novelty-seeking interaction provides the supervision needed to make memory useful for future exploration.

By utilizing deterministic and persistent novelty signals such as code coverage in the GUI domain, we provide natural, annotation-free supervision for the memory module.

Empirical evaluations demonstrate that JAMEL successfully generalizes to unseen environments.

Its exploration capability outperforms open-weight baselines and rivals the exploration depth of a closed-source model while reducing token consumption.

Snippet from the RSS feed

In open-ended environments, exploration is fundamental for autonomous agents, yet current language model agents struggle with this. Effective exploration requires memory, but retaining raw interaction histories is computationally expensive over long traje

You might also wanna read

Ouro: Looped Language Models That Build Reasoning into Pre-Training Through Latent Space Iteration

Researchers introduce Ouro, a family of pre-trained Looped Language Models (LoopLM) that build reasoning capabilities directly into the pre-

arxiv.org·5mo ago

Robot Memory System Enables AI Robots to Learn from Past Experiences

Robot Memory (robotmem) is a persistent memory system for AI robots that enables robots to learn from past experiences. The system stores ep

github.com·3mo ago

AgentGym-RL: A Reinforcement Learning Framework for Training LLM Agents in Multi-Turn Decision Making

This paper introduces AgentGym-RL, a unified reinforcement learning framework for training LLM agents to perform multi-turn interactive deci

arxiv.org·1d ago

Memori launches agent-native persistent memory infrastructure using structured knowledge graphs from agent trace data

Memori is a new agent-native memory infrastructure that enables AI agents to create structured, long-term persistent memory directly from ag

Product Hunt·1mo ago

Reinforcement Learning to Train Large Language Models to Explain Human Decisions

arxiv.org·1y ago

δ-mem: A Compact Online Memory Mechanism for Efficient Long-Context LLM Processing

The article presents δ-mem, a lightweight memory mechanism for large language models that augments frozen full-attention backbones with a co

arxiv.org·1mo ago
