arxiv_cs_cv February 10, 2026

Non-Smooth Components Are Advantageous for Vision Transformer Finetuning

Vision Transformer Finetuning Benefits from Non-Smooth Components

Translated: 2026/3/15 17:04:10

vision-transformer, finetuning, plasticity, transfer-learning, neural-networks

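Translation (from Japanese)

arXiv:2602.06883v2 Announce Type: replace-cross Abstract: The smoothness of the transformer architecture has been widely studied in the contexts of generalization, training stability, and adversarial robustness. However, its role in transfer learning is not well understood. In this paper, we analyze the ability of vision transformer components to adapt their outputs to changes in their inputs, that is, their plasticity. Defined as an average rate of change, plasticity captures sensitivity to input perturbations; in particular, high plasticity corresponds to low smoothness. Through theoretical analysis and comprehensive experiments, we show that this perspective gives principled guidance for selecting which components to prioritize during adaptation. The key takeaway for practitioners is that the high plasticity of the attention modules and feedforward layers consistently yields better finetuning performance. Our findings depart from the existing assumption that smoothness is desirable and offer a new perspective on the functional properties of transformers. The code is available at https://github.com/ambroiseodt/vit-plasticity.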

Original Content

arXiv:2602.06883v2 Announce Type: replace-cross Abstract: The smoothness of the transformer architecture has been extensively studied in the context of generalization, training stability, and adversarial robustness. However, its role in transfer learning remains poorly understood. In this paper, we analyze the ability of vision transformer components to adapt their outputs to changes in inputs, or, in other words, their plasticity. Defined as an average rate of change, it captures the sensitivity to input perturbation; in particular, a high plasticity implies low smoothness. We demonstrate through theoretical analysis and comprehensive experiments that this perspective provides principled guidance in choosing the components to prioritize during adaptation. A key takeaway for practitioners is that the high plasticity of the attention modules and feedforward layers consistently leads to better finetuning performance. Our findings depart from the prevailing assumption that smoothness is desirable, offering a novel perspective on the functional properties of transformers. The code is available at https://github.com/ambroiseodt/vit-plasticity.
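
The abstract describes plasticity only as "an average rate of change" that captures sensitivity to input perturbation; the exact definition is in the paper and is not reproduced here. As a rough illustration, the sketch below assumes one plausible form: the mean of ||f(x) - f(y)|| / ||x - y|| over randomly perturbed input pairs. The function names (average_rate_of_change, scale_by, saturating), the Gaussian sampling scheme, and the toy maps are all hypothetical and are not taken from the authors' repository.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def average_rate_of_change(f, dim, n_pairs=1000, eps=0.1, seed=0):
    """Monte Carlo estimate of mean ||f(x) - f(y)|| / ||x - y||.

    ASSUMPTION: this ratio is only one reading of "average rate of change";
    the paper's formal definition of plasticity may differ.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        # Sample an input and a small Gaussian perturbation of it.
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        delta = [rng.gauss(0.0, eps) for _ in range(dim)]
        y = [xi + di for xi, di in zip(x, delta)]
        out_diff = [a - b for a, b in zip(f(x), f(y))]
        total += l2_norm(out_diff) / l2_norm(delta)
    return total / n_pairs


def scale_by(c):
    """Toy component: multiplies every coordinate by c."""
    return lambda v: [c * x for x in v]


def saturating(v):
    """Toy smooth component: elementwise tanh, which is 1-Lipschitz."""
    return [math.tanh(x) for x in v]


if __name__ == "__main__":
    for name, f in [("scale 0.5", scale_by(0.5)),
                    ("scale 2.0", scale_by(2.0)),
                    ("tanh", saturating)]:
        print(name, average_rate_of_change(f, dim=8))
```

Under this assumed measure, the map that scales inputs by 2.0 has an estimate close to 2.0, while the tanh map stays below 1.0. This mirrors the abstract's statement that higher plasticity implies lower smoothness. Applying the same idea to real attention or feedforward modules would require the paper's definition and the authors' code.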