arxiv_cs_ai April 20, 2026

From Multi-Agent to Single-Agent: When Is Skill Distillation Beneficial?

multi-agent, skill-distillation, reinforcement-learning, metric-freedom, adaptive-distillation

Original Content

arXiv:2604.01608v2
Announce Type: replace

Abstract: Multi-agent systems (MAS) tackle complex tasks by distributing expertise, though this often comes at the cost of heavy coordination overhead, context fragmentation, and brittle phase ordering. Distilling a MAS into a single-agent skill can bypass these costs, but this conversion lacks a principled answer for when and what to distill. Instead, the empirical outcome is surprisingly inconsistent: skill lift ranges from a 28% improvement to a 2% degradation across metrics of the exact same task. In this work, we reveal that skill utility is governed not by the task, but by the evaluation metric. We introduce Metric Freedom (F), the first a priori predictor of skill utility. F measures the topological rigidity of a metric's scoring landscape by quantifying how output diversity couples with score variance via a Mantel test. Guided by F, we propose AdaSkill, a two-stage adaptive distillation framework. Stage 1 acts as a selective extraction mechanism, extracting tools and knowledge while discarding restrictive structures on "free" metrics to preserve exploration. Stage 2 applies iterative refinement on free metrics, exploiting their flat scoring landscape to safely maximize remaining headroom without oscillation. Evaluating across 4 tasks, 11 datasets, and 6 metrics, F strongly predicts skill utility (r=-0.85, p<0.0001). Strikingly, identical agent trajectories yield diametrically opposite skill lifts under rigid versus free metrics, demonstrating that skill utility is fundamentally a metric-level property. Driven by this signal, AdaSkill matches or exceeds the original MAS while reducing cost by up to 8x and latency by up to 15x.
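The abstract says F is built on a Mantel test relating output diversity to score variance, but it does not give the formula. The following is a minimal sketch of a generic permutation-based Mantel test only, not the authors' definition of F. The function names, the use of absolute differences as distances, and the toy numbers are all hypothetical and chosen purely for illustration.

```python
import itertools
import random
from statistics import mean


def abs_distance_matrix(values):
    """Pairwise |x_i - x_j| matrix for a list of numbers (an assumed distance)."""
    return [[abs(a - b) for b in values] for a in values]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input has zero variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def upper_triangle(matrix, order):
    """Flatten the strict upper triangle after relabeling rows/columns by `order`."""
    pairs = itertools.combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def mantel_test(dist_a, dist_b, permutations=999, seed=0):
    """Correlate two distance matrices; estimate a two-sided p-value by
    jointly permuting the rows and columns of the second matrix."""
    identity = list(range(len(dist_a)))
    flat_a = upper_triangle(dist_a, identity)
    observed = pearson(flat_a, upper_triangle(dist_b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        permuted = pearson(flat_a, upper_triangle(dist_b, perm))
        # Small tolerance guards against floating-point summation order.
        if abs(permuted) >= abs(observed) - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement as one permutation.
    return observed, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Hypothetical data: one scalar feature per sampled output, and its score.
    output_features = [0.1, 0.4, 0.5, 0.9, 1.3, 2.0]
    metric_scores = [0.20, 0.35, 0.30, 0.60, 0.70, 0.95]
    r, p = mantel_test(
        abs_distance_matrix(output_features),
        abs_distance_matrix(metric_scores),
    )
    print(f"Mantel r = {r:.3f}, permutation p = {p:.3f}")
```

In this sketch, a strong Mantel correlation means outputs that differ more also score more differently. How the paper maps such a statistic onto F, and which output distance it uses, is not stated in the abstract.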