arxiv_cs_lg April 24, 2026

Model Internal Sleuthing: Finding Lexical Identity and Inflectional Features in Modern Language Models

transformers, natural-language-processing, language-models, linguistic-features, neural-architecture-search

Original Content

arXiv:2506.02132v5 Announce Type: replace-cross

Abstract: Large transformer-based language models dominate modern NLP, yet our understanding of how they encode linguistic information relies primarily on studies of early models like BERT and GPT-2. We systematically probe 25 models, from BERT Base to Qwen2.5-7B, focusing on two linguistic properties, lexical identity and inflectional features, across 6 diverse languages. We find a consistent pattern: inflectional features are linearly decodable throughout the model, while lexical identity is prominent in early layers but weakens progressively with depth. Further analysis of the representation geometry reveals that models with aggressive mid-layer dimensionality compression show reduced steering effectiveness in those layers, even though probe accuracy remains high. Pretraining analysis shows that inflectional structure stabilizes early, while lexical identity representations continue to evolve. Taken together, our findings suggest that transformers maintain inflectional features across layers while trading off lexical identity for compact, predictive representations. Our code is available at https://github.com/ml5885/model_internal_sleuthing.
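
To make "linearly decodable" concrete, the following is a minimal sketch of a layer-wise linear probe, not the authors' implementation (their repository linked in the abstract is the authoritative source). It assumes a closed-form ridge regression onto one-hot labels as the probe, which may differ from the classifier used in the paper. The function names (`fit_linear_probe`, `probe_accuracy`, `layerwise_probe`) are hypothetical, and the activations are synthetic Gaussian clusters standing in for real per-layer hidden states.

```python
import numpy as np


def fit_linear_probe(X, y, n_classes, l2=1e-2):
    """Fit a linear probe by ridge regression onto one-hot labels.

    Assumption: this closed-form probe is an illustrative stand-in,
    not necessarily the probe used in the paper.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    Y = np.eye(n_classes)[y]                       # one-hot targets
    A = Xb.T @ Xb + l2 * np.eye(Xb.shape[1])       # regularized Gram matrix
    return np.linalg.solve(A, Xb.T @ Y)            # weights, shape (d + 1, n_classes)


def probe_accuracy(W, X, y):
    """Fraction of examples whose highest-scoring class matches the label."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return float((np.argmax(Xb @ W, axis=1) == y).mean())


def layerwise_probe(layer_acts, y, n_classes, train_frac=0.8, seed=0):
    """Train one probe per layer and return held-out accuracy for each."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(train_frac * len(y))
    train, test = idx[:cut], idx[cut:]
    return [
        probe_accuracy(fit_linear_probe(X[train], y[train], n_classes), X[test], y[test])
        for X in layer_acts
    ]


# Synthetic stand-in data (hypothetical): 4 "layers", 3 label classes,
# each class a Gaussian cluster around a random center.
rng = np.random.default_rng(0)
n, d, k = 600, 32, 3
y = rng.integers(0, k, size=n)
centers = 3.0 * rng.normal(size=(k, d))
layer_acts = [centers[y] + rng.normal(size=(n, d)) for _ in range(4)]

accs = layerwise_probe(layer_acts, y, k)
for layer, acc in enumerate(accs):
    print(f"layer {layer}: held-out probe accuracy = {acc:.3f}")
```

With real models, each entry of `layer_acts` would instead hold the hidden states of one layer for the target tokens, and `y` would hold either lemma labels (lexical identity) or inflection labels. Comparing the resulting accuracy curves across layers is the kind of analysis the abstract describes.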