arxiv_cs_lg April 24, 2026

Understanding the Staged Dynamics of Transformers in Learning Latent Structure

Translated: 2026/4/24 20:09:07

transformers, latent-structure, causal-interventions, machine-learning, neural-networks

Original Content

arXiv:2511.19328v2 Announce Type: replace Abstract: Language modeling has shown us that transformers can discover latent structure from context, but the dynamics of how they acquire different components of that structure remain poorly understood, leading to assertions that models just remix training data. In this work, we use the Alchemy benchmark in a controlled setting (Wang et al., 2021) to investigate latent structure learning. We train a small decoder-only transformer on three task variants: 1) inferring missing transitions from partial contextual information, 2) composing simple rules to solve multi-transition sequences, and 3) decomposing complex multi-step examples to infer intermediate transitions. By factorizing each task into interpretable components, we show that the model learns the different latent structure components in discrete stages. We also observe an asymmetry: the model composes fundamental transitions robustly, but struggles to decompose complex examples to discover the atomic transitions. Finally, using causal interventions, we identify layer-specific plasticity windows during which freezing substantially delays or prevents stage completion. These findings provide insight into how a transformer model acquires latent structure, offering a detailed view of how capabilities evolve during training.
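
The layer-freezing intervention mentioned in the abstract could be scheduled roughly as follows. This is a minimal sketch under stated assumptions, not the authors' code: `FreezeWindow`, `trainable_layers`, the layer indices, and the step ranges are all hypothetical names and values chosen for illustration, and the abstract does not say how the windows were actually selected.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FreezeWindow:
    """Hypothetical record: freeze `layer` for steps in [start_step, end_step)."""
    layer: int
    start_step: int
    end_step: int  # exclusive


def trainable_layers(num_layers: int, step: int,
                     windows: Iterable[FreezeWindow]) -> List[int]:
    """Return the indices of layers that should receive updates at `step`."""
    # A layer is frozen at this step if any window covering the step names it.
    frozen = {w.layer for w in windows if w.start_step <= step < w.end_step}
    return [i for i in range(num_layers) if i not in frozen]


# Illustrative schedule (made-up values): freeze layer 1 for steps 100..199.
schedule = [FreezeWindow(layer=1, start_step=100, end_step=200)]

before = trainable_layers(4, 99, schedule)   # window not yet open
during = trainable_layers(4, 150, schedule)  # layer 1 is frozen
after = trainable_layers(4, 200, schedule)   # window has closed
```

In a real training loop, the returned indices would decide which transformer blocks are updated at each step, for example by toggling `requires_grad` on each block's parameters in PyTorch. Comparing when a stage completes with and without a given window is then what would reveal whether that layer was plastic during that period.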