arxiv_cs_lg April 24, 2026

Convergent Evolution: How Different Language Models Learn Similar Number Representations

Translated: 2026/4/24 20:06:42

language-models, convergent-evolution, number-representation, transformers, fourier-analysis

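Translation (from the Japanese)

arXiv:2604.20817v1 Announce Type: cross Abstract: Language models trained on natural text represent numbers using periodic features whose dominant periods are $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy among these features. Models trained in different ways, including Transformers, linear RNNs, LSTMs, and classical word embeddings, all learn features with period-$T$ spikes in the Fourier domain, but only some of them learn geometrically separable features that can be used to linearly classify a number mod $T$. To explain this discrepancy, we prove that sparsity in the Fourier domain is necessary but not sufficient for mod-$T$ geometric separability. Through empirical study, we examine the conditions under which training produces geometrically separable features and find that the data, architecture, optimizer, and tokenizer all play important roles. In particular, we identify two distinct routes by which a model can acquire geometrically separable features: it can learn them from complementary co-occurrence signals in general language data (including text-number co-occurrence and number-number interaction), or from multi-token (but not single-token) addition problems. Overall, our results highlight the phenomenon of convergent evolution in feature learning: diverse kinds of models learn similar features from different training signals.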

Original Content

arXiv:2604.20817v1 Announce Type: cross Abstract: Language models trained on natural text learn to represent numbers using periodic features with dominant periods at $T=2, 5, 10$. In this paper, we identify a two-tiered hierarchy of these features: while Transformers, Linear RNNs, LSTMs, and classical word embeddings trained in different ways all learn features that have period-$T$ spikes in the Fourier domain, only some learn geometrically separable features that can be used to linearly classify a number mod-$T$. To explain this incongruity, we prove that Fourier domain sparsity is necessary but not sufficient for mod-$T$ geometric separability. Empirically, we investigate when model training yields geometrically separable features, finding that the data, architecture, optimizer, and tokenizer all play key roles. In particular, we identify two different routes through which models can acquire geometrically separable features: they can learn them from complementary co-occurrence signals in general language data, including text-number co-occurrence and cross-number interaction, or from multi-token (but not single-token) addition problems. Overall, our results highlight the phenomenon of convergent evolution in feature learning: A diverse range of models learn similar features from different training signals.
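The gap between "period-$T$ spike in the Fourier domain" and "linearly classifiable mod $T$" can be illustrated with a toy construction. The sketch below is not the paper's method or proof: the features are synthetic, the names (`cos_only`, `cos_sin`, `predict_residue`) and the range `N = 100` are hypothetical, and "geometrically separable" is read here, as an assumption, to mean that a linear argmax classifier recovers `n mod T`. Both toy features show a dominant period of $T$, yet the single-cosine feature maps residues $1$ and $T-1$ to the same point, so no classifier can tell them apart.

```python
import cmath
import math

N = 100  # hypothetical number range 0..N-1 (chosen as a multiple of T)
T = 5    # one of the dominant periods named in the abstract


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal)))
        for k in range(n)
    ]


def dominant_period(signal):
    """Period of the strongest non-constant frequency, up to Nyquist."""
    mags = dft_magnitudes(signal)
    half = len(signal) // 2
    best_k = max(range(1, half + 1), key=lambda k: mags[k])
    return len(signal) / best_k


# Toy feature A (assumption): one cosine channel per number.
cos_only = [[math.cos(2 * math.pi * n / T)] for n in range(N)]

# Toy feature B (assumption): cosine and sine channels, i.e. a point on a circle.
cos_sin = [
    [math.cos(2 * math.pi * n / T), math.sin(2 * math.pi * n / T)]
    for n in range(N)
]


def predict_residue(x):
    """Hand-built linear classifier for feature B.

    Scores each residue r with the dot product <w_r, x>, where
    w_r = (cos(2*pi*r/T), sin(2*pi*r/T)), and returns the argmax.
    """
    scores = [
        x[0] * math.cos(2 * math.pi * r / T)
        + x[1] * math.sin(2 * math.pi * r / T)
        for r in range(T)
    ]
    return max(range(T), key=lambda r: scores[r])


# Both features have their Fourier spike at period T.
period_a = dominant_period([v[0] for v in cos_only])
period_b = dominant_period([v[0] for v in cos_sin])

# Feature B is linearly classifiable mod T with the classifier above.
accuracy_b = sum(predict_residue(cos_sin[n]) == n % T for n in range(N)) / N

# Feature A maps residues 1 and T-1 to the same value, so they cannot be separated.
collapsed_a = math.isclose(cos_only[1][0], cos_only[T - 1][0])

print(period_a, period_b, accuracy_b, collapsed_a)
```

In this toy setting both features report a dominant period of 5, the cosine-and-sine feature is classified perfectly, and the cosine-only feature collapses two residue classes. This matches the "not sufficient" direction of the abstract's claim only for these hand-picked features; real model embeddings would need a trained probe.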