Skill-MAS: A Meta-Skill Approach to Improving Multi-Agent Systems Without Retraining

[Submitted on 17 Jun 2026]

Summary

Skill-MAS proposes a novel approach to LLM-based automatic Multi-Agent Systems (MAS) generation that bridges the gap between inference-time methods (which use frozen frontier LLMs but don't learn from experience) and training-time methods (which learn via gradient updates but are limited by smaller model capabilities). The approach introduces a "Meta-Skill" concept that decouples experience retention from parametric updates, using a closed optimization loop with Multi-Trajectory Rollout and Selective Reflection. Experiments across four benchmarks and four LLMs show performance gains with favorable cost-performance trade-offs, and the evolved Meta-Skills demonstrate robustness and transferability across unseen tasks and different LLMs.
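
The closed loop described above can be pictured with a small toy sketch. This is not the paper's implementation: the summary does not specify how rollouts are scored, which trajectories are selected, or how the Meta-Skill is rewritten, so every name here (`Trajectory`, `toy_run_mas`, `toy_reflect`, the best-versus-worst selection rule) is a hypothetical stand-in. The only point it illustrates is that experience is kept in an editable text artifact while the model itself stays frozen.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    task: str
    output: str
    score: float  # assumed scalar feedback; the source does not define the metric


def multi_trajectory_rollout(meta_skill: str, task: str,
                             run_mas: Callable[[str, str, int], Trajectory],
                             n: int) -> List[Trajectory]:
    """Run the same task n times under the current Meta-Skill."""
    return [run_mas(meta_skill, task, seed) for seed in range(n)]


def select_for_reflection(trajs: List[Trajectory]) -> List[Trajectory]:
    """Hypothetical selection rule: keep the best and worst rollout,
    and only when their scores differ (otherwise there is no contrast)."""
    best = max(trajs, key=lambda t: t.score)
    worst = min(trajs, key=lambda t: t.score)
    return [best, worst] if best.score > worst.score else []


def evolve_meta_skill(meta_skill: str, tasks: List[str],
                      run_mas: Callable[[str, str, int], Trajectory],
                      reflect: Callable[[str, List[Trajectory]], str],
                      n_rollouts: int = 4) -> str:
    """Closed loop: rollout, select, reflect, update the text. No weights change."""
    for task in tasks:
        trajs = multi_trajectory_rollout(meta_skill, task, run_mas, n_rollouts)
        chosen = select_for_reflection(trajs)
        if chosen:
            meta_skill = reflect(meta_skill, chosen)
    return meta_skill


def toy_run_mas(meta_skill: str, task: str, seed: int) -> Trajectory:
    # Stand-in for a frozen LLM generating and running a MAS; no model is called.
    verified = "verify answers" in meta_skill or seed % 2 == 0
    return Trajectory(task, "checked" if verified else "unchecked",
                      1.0 if verified else 0.0)


def toy_reflect(meta_skill: str, chosen: List[Trajectory]) -> str:
    # Stand-in for an LLM reflection step that distills a lesson into the text.
    lesson = "verify answers"
    return meta_skill if lesson in meta_skill else meta_skill + "\n- " + lesson


evolved = evolve_meta_skill("Plan the agent team.", ["t1", "t2"],
                            toy_run_mas, toy_reflect)
print(evolved)
```

In this toy run the first task produces mixed scores, so reflection adds one lesson to the Meta-Skill; the second task then succeeds on every rollout, nothing is selected, and the text is left unchanged.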

Source

Twitter / X · Skill-MAS: A Meta-Skill Approach to Improving Multi-Agent Systems Without Retraining · arxiv.org

Key quotes

Skill-MAS, a novel third path that decouples experience retention from parametric updates by conceptualizing the high-level orchestration capability as an evolvable Meta-Skill.

Inference-time MAS leverages frozen frontier LLMs but repeats identical searches without learning from past experience.

Training-time MAS internalizes experience via gradient updates but is constrained by the low capability ceiling of smaller models, and is hard to scale to large frontier LLMs.

Extensive experiments across four complex benchmarks and four distinct LLMs demonstrate that Skill-MAS not only achieves remarkable performance gains but also maintains a favorable cost-performance trade-off.

Further analysis reveals that the evolved Meta-Skills are highly robust and exhibit strong transferability across unseen tasks and different LLMs.

Snippet from the RSS feed

Large Language Model (LLM)-based automatic Multi-Agent Systems (MAS) generation has become a crucial frontier for tackling complex tasks. However, existing methods face a dilemma between model capability and experience retention. Inference-time MAS levera

You might also wanna read

SkillsBench: A Benchmark for Evaluating AI Agent Skills Across Diverse Tasks

SkillsBench is a new benchmark for evaluating how well AI agent skills work across diverse tasks. The benchmark includes 86 tasks across 11

arxiv.org·4mo ago

EvoTrainer: A Framework for Co-Evolving LLM Policies and Training Harnesses in Autonomous Agentic Reinforcement Learning

The article introduces EvoTrainer, an autonomous training framework for LLMs that goes beyond traditional recipe search by co-evolving both

arxiv.org·19d ago

Arbor: A Multi-Agent Framework Using Tree Search for Autonomous LLM Inference Optimization

Arbor is a multi-agent framework that uses structured tree search as a cognition layer for autonomous agents operating in large, stateful ac

arxiv.org·7d ago

Latent Agents: Distilling Multi-Agent Debate into Single LLMs via Post-Training Internalization

This paper introduces "Latent Agents," a post-training framework that distills multi-agent debate into a single LLM through a two-stage fine

arxiv.org·17d ago

OpenClaw-Skill: A Collective Skill Tree Search Framework for LLM-Based AI Agents

A new research paper introduces OpenClaw-Skill, a Collective Skill Tree Search (CSTS) framework that addresses key limitations in LLM-based

clawbeat.co·2d ago

Microsoft and researchers develop SkillOpt to train AI agent instruction documents like model weights

Microsoft and three Chinese universities developed SkillOpt, a method that optimizes instruction documents (Markdown files) for AI agents us

the-decoder.com·9d ago
