Qwen-AgentWorld: Language World Models for Simulating Agentic Environments Across 7 Domains
[Submitted on 23 Jun 2026]
Summary
This paper introduces Qwen-AgentWorld, a family of language world models (35B-A3B and 397B-A17B) designed to simulate agentic environments across 7 domains using long chain-of-thought reasoning. The models are trained on over 10 million real-world interaction trajectories through a three-stage pipeline (CPT, SFT, RL). The authors also present AgentWorldBench, a benchmark for evaluating language world models, and explore two paradigms where world modeling enhances general agents: as a decoupled environment simulator for scalable agentic RL training, and as a unified agent foundation model where world-model training serves as an effective warm-up for downstream tasks.
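The "decoupled environment simulator" paradigm can be illustrated with a minimal Python sketch. This is not the paper's implementation or any Qwen-AgentWorld API. All names here (`toy_world_model`, `rollout`, `scripted_policy`) and the tiny `read`/`write` action vocabulary are hypothetical, and hand-written rules stand in for the text a language world model would generate. The sketch shows only the interface: the simulator predicts the next observation from the interaction history and the agent's action, so an agent can collect trajectories without touching a real environment.

```python
from typing import Callable, List, Tuple

# Hypothetical types for illustration; they do not come from the paper.
Action = str
Observation = str
History = List[Tuple[Action, Observation]]


def toy_world_model(history: History, action: Action) -> Observation:
    """Predict the next observation from the history and a new action.

    Assumption: a real language world model would generate this text with
    an LLM; simple rules over a toy key-value store stand in for it here.
    """
    # Reconstruct the simulated state by replaying past "write" actions.
    written = {}
    for past_action, _ in history:
        parts = past_action.split(maxsplit=2)
        if len(parts) == 3 and parts[0] == "write":
            written[parts[1]] = parts[2]

    # Predict the outcome of the new action against that state.
    parts = action.split(maxsplit=2)
    if len(parts) == 3 and parts[0] == "write":
        return "ok"
    if len(parts) == 2 and parts[0] == "read":
        return written.get(parts[1], "error: not found")
    return "error: unknown action"


def rollout(
    policy: Callable[[History], Action],
    world_model: Callable[[History, Action], Observation],
    max_steps: int,
) -> History:
    """Collect a trajectory using only the simulator, never a real environment."""
    history: History = []
    for _ in range(max_steps):
        action = policy(history)
        observation = world_model(history, action)
        history.append((action, observation))
    return history


def scripted_policy(history: History) -> Action:
    """Hypothetical stand-in for an agent: replays a fixed script of actions."""
    script = ["read notes", "write notes hello", "read notes"]
    return script[len(history) % len(script)]


trajectory = rollout(scripted_policy, toy_world_model, max_steps=3)
for action, observation in trajectory:
    print(f"{action!r} -> {observation!r}")
```

Because the policy and the simulator meet only through the `(history, action) -> observation` call, either side can be swapped independently. That separation is what "decoupled" refers to in this sketch.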
Key quotes
A world model predicts environment dynamics based on current observations and actions, serving as a core cognitive mechanism for reasoning and planning.
We introduce Qwen-AgentWorld-35B-A3B and Qwen-AgentWorld-397B-A17B, the first language world models capable of simulating agentic environments covering 7 domains via long chain-of-thought reasoning.
Empirical results demonstrate that Qwen-AgentWorld significantly outperforms existing frontier models.
As a decoupled environment simulator, Qwen-AgentWorld supports scalable and controllable simulation of thousands of real-world environments for agentic RL, yielding gains that surpass real-environment training alone.
As a unified agent foundation model, world-model training acts as a highly effective warm-up that improves downstream performance across 7 agentic benchmarks.
You might also wanna read
DILLO: A Language-Based World Model for Proactive Agent Steering Without Visual Simulation
This paper introduces DILLO (DIstiLLed Language-ActiOn World Model), a proactive agent steering framework that replaces slow visual simulati
AgentGym-RL: A Reinforcement Learning Framework for Training LLM Agents in Multi-Turn Decision Making
This paper introduces AgentGym-RL, a unified reinforcement learning framework for training LLM agents to perform multi-turn interactive deci
Skill-MAS: A Meta-Skill Approach to Improving Multi-Agent Systems Without Retraining
Skill-MAS proposes a novel approach to LLM-based automatic Multi-Agent Systems (MAS) generation that bridges the gap between inference-time
LifeSkill: A Reinforcement Learning Framework for Online Lifelong Learning in LLM Agents
This paper introduces LifeSkill, a two-stage reinforcement learning framework for online lifelong learning in Large Language Model (LLM) age
How AI World Models Are Bridging Simulation and Reality
This article explores the concept of "world models" in AI — where AI systems learn and practice tasks through simulated environments within
JAMEL: A Framework for Joint Memory and Exploration Learning in Language Model Agents
This paper introduces JAMEL (Joint Agent Memory and Exploration Learning), a framework that trains language model agents to explore open-end
