arxiv_cs_ai April 24, 2026

To See the Unseen: on the Generalization Ability of Transformers in Symbolic Reasoning

Translated: 2026/4/24 20:18:06

transformers symbolic-reasoning generalization arxiv gemma

Original Content

arXiv:2604.21632v1 Announce Type: new Abstract: We investigate the ability of decoder-only transformer models to perform abstract symbolic reasoning, specifically solving propositional logic reasoning problems given in context. Previous work demonstrated that models fail to generalize to problems involving variable names that were not observed during training, and showed that one reason for this is the difficulty of copying (or generating) unseen tokens. We show both theoretically and empirically that a particular representational collapse also plays a crucial role: the unembeddings (last-layer weights) of unseen tokens collapse to nearly the same vector during training. The collapse makes it difficult for the model to distinguish multiple unseen variables (especially when the embedding and unembedding parameters are shared), and it provides a mechanistic explanation for the effectiveness of existing heuristic interventions like "active forgetting", which periodically resets the token (un)embeddings. Based on these observations, we devise a combination of techniques, involving a small architecture change that facilitates copying, data diversity, and freezing or resetting (un)embeddings, that achieves generalization to unseen tokens. We support our claims with extensive controlled experiments on propositional logic reasoning problems. Beyond synthetic experiments, we also observe evidence of (un)embedding collapse in the open-weight models of the Gemma 3 family, which includes 99 unused tokens reserved for downstream use. Empirically, we find that the correlated embeddings of these tokens are a poor initialization for finetuning applications.
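
The collapse described in the abstract can be illustrated with a toy softmax layer. The sketch below is not the paper's experimental setup: it trains only an unembedding matrix with plain gradient descent on synthetic hidden states, and every name in it (`train_unembedding`, `mean_pairwise_cosine`, `freeze_unseen`) is hypothetical. It also assumes that hidden states share a common offset component, which is a modeling choice made here for illustration. Tokens that never appear as targets receive only the "push down" part of the cross-entropy gradient, which is nearly the same for all of them, so their rows drift in a shared direction. Zeroing their gradient is a simple stand-in for the freezing intervention mentioned above.

```python
import numpy as np


def mean_pairwise_cosine(rows):
    """Average cosine similarity over all distinct pairs of rows."""
    unit = rows / np.linalg.norm(rows, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = len(rows)
    return (sims.sum() - n) / (n * (n - 1))  # drop the n diagonal ones


def train_unembedding(freeze_unseen=False, steps=2000, lr=0.1, seed=0):
    """Train a toy unembedding matrix; return (cosine before, cosine after)
    for the rows of tokens that never occur as targets."""
    rng = np.random.default_rng(seed)
    d, n_seen, n_unseen, n = 16, 10, 10, 2000
    vocab = n_seen + n_unseen

    # Synthetic "hidden states": shared offset (assumption) + class signal + noise.
    prototypes = 0.5 * rng.normal(size=(n_seen, d))
    labels = rng.integers(0, n_seen, size=n)  # targets are seen tokens only
    hidden = 1.0 + prototypes[labels] + rng.normal(size=(n, d))

    U = 0.02 * rng.normal(size=(vocab, d))  # unembedding, small random init
    unseen = slice(n_seen, vocab)
    before = mean_pairwise_cosine(U[unseen])
    onehot = np.eye(vocab)[labels]

    for _ in range(steps):
        logits = hidden @ U.T
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Cross-entropy gradient w.r.t. U; for unseen rows the one-hot term
        # is always zero, so only the probability-weighted hidden states remain.
        grad = (probs - onehot).T @ hidden / n
        if freeze_unseen:
            grad[unseen] = 0.0  # keep unseen rows at their initialization
        U -= lr * grad

    return before, mean_pairwise_cosine(U[unseen])


if __name__ == "__main__":
    print("trained:", train_unembedding())
    print("frozen: ", train_unembedding(freeze_unseen=True))
```

In this toy setting the mean pairwise cosine similarity of the unseen rows starts near zero and should rise sharply after training, while the frozen variant keeps the initial value. How closely this mirrors a full transformer, where embeddings may be tied and optimizers such as Adam are used, is not something this sketch establishes.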