AgentGym-RL: A Reinforcement Learning Framework for Training LLM Agents in Multi-Turn Decision Making

[Submitted on 10 Sep 2025]

Summary

This paper introduces AgentGym-RL, a unified reinforcement learning (RL) framework for training LLM agents in multi-turn interactive decision-making across diverse real-world environments. Rather than relying on supervised fine-tuning (SFT), AgentGym-RL trains agents from scratch with RL. The framework has a modular, decoupled architecture that supports mainstream RL algorithms and a wide range of scenarios. The authors also propose ScalingInter-RL, a training approach that balances exploration and exploitation: early in training it restricts the number of interactions, which emphasizes exploitation, and it then gradually lengthens the horizon to encourage exploration and more diverse problem-solving strategies. In the reported experiments, the trained agents match or surpass commercial models on 27 tasks across diverse environments. The authors state that they will open-source the complete framework, including code and datasets.
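The horizon-expansion idea behind ScalingInter-RL can be illustrated with a small schedule function. This is a minimal sketch, not the authors' implementation: the summary only says that interactions are restricted early and the horizon grows gradually, so the staged, step-wise form of the schedule, the function name `max_interaction_turns`, and every default value below are assumptions chosen for illustration.

```python
def max_interaction_turns(
    step: int,
    start_turns: int = 5,        # assumed initial cap on interactions
    turns_per_stage: int = 5,    # assumed growth of the cap per stage
    steps_per_stage: int = 100,  # assumed number of training steps per stage
    max_turns: int = 20,         # assumed final cap on interactions
) -> int:
    """Return the interaction cap for a given training step.

    Hypothetical staged schedule: the cap starts small (short horizons,
    favoring exploitation) and grows by a fixed amount each stage until
    it reaches max_turns (longer horizons, allowing more exploration).
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    stage = step // steps_per_stage
    return min(max_turns, start_turns + stage * turns_per_stage)


if __name__ == "__main__":
    # Print the cap at a few training steps to show the staged growth.
    for step in (0, 99, 100, 250, 1000):
        print(step, max_interaction_turns(step))
```

In a training loop, a rollout collector would truncate each episode once it reaches the cap returned for the current step; how the actual framework applies its schedule is not described in this summary.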

Source

Twitter / X · AgentGym-RL: A Reinforcement Learning Framework for Training LLM Agents in Multi-Turn Decision Making · arxiv.org

Key quotes

Like human cognitive development, agents are expected to acquire knowledge and skills through exploration and interaction with the environment.

We introduce AgentGym-RL, a new framework to train LLM agents for multi-turn interactive decision-making through RL.

In early stages, it emphasizes exploitation by restricting the number of interactions, and gradually shifts towards exploration with larger horizons to encourage diverse problem-solving strategies.

Our agents match or surpass commercial models on 27 tasks across diverse environments.

We will open-source the complete AgentGym-RL framework — including code and datasets — to empower the research community in developing the next generation of intelligent agents.

Snippet from the RSS feed

Developing autonomous LLM agents capable of making a series of intelligent decisions to solve complex, real-world tasks is a fast-evolving frontier. Like human cognitive development, agents are expected to acquire knowledge and skills through exploration and interaction with the environment.
