arxiv_cs_ai February 10, 2026

ASA: Training-Free Representation Engineering for Tool-Calling Agents

Translated: 2026/2/14 8:19:37

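Translation (from the Japanese)

Adapting LLM agents to domain-specific tool calling remains brittle when interfaces keep changing. Prompt and schema engineering is easy to deploy but fragile under distribution shift and strict parsers. Continual parameter-efficient fine-tuning improves reliability, but at the cost of training, maintenance, and potential forgetting. We observe that whether a tool is needed can be decoded almost perfectly from mid-layer activations, yet the model stays conservative about entering tool mode; we call this the Lazy Agent failure mode. We propose ASA (Activation Steering Adapter), a training-free, inference-time controller that applies a single-shot intervention at a middle layer to amplify true intent while suppressing spurious triggers. It does so through a router-conditioned mixture of steering vectors and a probe-guided signed gate. On MTU-Bench with Qwen2.5-1.5B, ASA improves strict tool-use F1 from 0.18 to 0.50 and reduces the false positive rate from 0.15 to 0.05. ASA requires only about 20KB of portable assets and does not modify the model weights.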

Original Content

arXiv:2602.04935v2 Announce Type: replace-cross

Abstract: Adapting LLM agents to domain-specific tool calling remains notably brittle under evolving interfaces. Prompt and schema engineering is easy to deploy but often fragile under distribution shift and strict parsers, while continual parameter-efficient fine-tuning improves reliability at the cost of training, maintenance, and potential forgetting. We identify a critical Lazy Agent failure mode where tool necessity is nearly perfectly decodable from mid-layer activations, yet the model remains conservative in entering tool mode, revealing a representation-behavior gap. We propose Activation Steering Adapter (ASA), a training-free, inference-time controller that performs a single-shot mid-layer intervention and targets tool domains via a router-conditioned mixture of steering vectors with a probe-guided signed gate to amplify true intent while suppressing spurious triggers. On MTU-Bench with Qwen2.5-1.5B, ASA improves strict tool-use F1 from 0.18 to 0.50 while reducing the false positive rate from 0.15 to 0.05, using only about 20KB of portable assets and no weight updates.
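The abstract names three ingredients (a router-conditioned mixture of steering vectors, a probe, and a signed gate) but gives no formulas. The following is only an illustrative sketch of how such pieces could combine into one additive edit of a mid-layer hidden state. Everything in it is an assumption made for illustration: the function name `steer`, the softmax router, the logistic probe, the hard ±1 gate, the `alpha` scale, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def steer(hidden, router_weights, steering_vectors,
          probe_w, probe_b, alpha=1.0, threshold=0.5):
    """Apply one hypothetical steering step to a single hidden state.

    hidden           : mid-layer activation (list of floats)
    router_weights   : one weight vector per tool domain (assumed linear router)
    steering_vectors : one steering direction per tool domain
    probe_w, probe_b : assumed logistic probe for "a tool is needed"
    Returns (steered_hidden, gate).
    """
    # Router: soft assignment of the input to tool domains.
    mix = softmax([dot(w, hidden) for w in router_weights])

    # Mixture of steering vectors, weighted by the router output.
    dim = len(hidden)
    direction = [sum(m * v[i] for m, v in zip(mix, steering_vectors))
                 for i in range(dim)]

    # Probe: estimated probability that the request needs a tool.
    p_tool = 1.0 / (1.0 + math.exp(-(dot(probe_w, hidden) + probe_b)))

    # Signed gate: push toward tool mode if the probe fires, away otherwise.
    gate = 1.0 if p_tool >= threshold else -1.0

    # Single additive intervention; model weights are never touched.
    steered = [h + alpha * gate * d for h, d in zip(hidden, direction)]
    return steered, gate


if __name__ == "__main__":
    routers = [[1.0, 0.0], [0.0, 1.0]]
    vectors = [[1.0, 0.0], [0.0, 1.0]]
    probe_w, probe_b = [2.0, 0.0], 0.0

    print(steer([1.0, 0.0], routers, vectors, probe_w, probe_b))
    print(steer([-1.0, 0.0], routers, vectors, probe_w, probe_b))
```

In this toy setup the first input makes the probe fire, so the hidden state is moved along the mixed steering direction; the second input leaves the probe below threshold, so the same kind of direction is subtracted. In a real model, an edit of this kind would typically be applied through a forward hook on one transformer layer, with the steering vectors, router, and probe stored as a small set of side assets.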