arxiv_cs_lg February 10, 2026

Parallel Layer Normalization for Universal Approximation

Translated: 2026/3/15 9:05:13

neural-networks, layer-normalization, universal-approximation, machine-learning

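Translation (from Japanese)

arXiv:2505.13142v2 Announce Type: replace Abstract: This paper studies the approximation capability of neural networks that combine layer normalization (LN) with linear layers. We prove that networks with parallel layer normalization (PLN) inserted between two linear layers (referred to as PLN-Nets) achieve universal approximation, whereas architectures that use only standard LN exhibit strictly limited expressive power. We further analyze the approximation rates of shallow and deep PLN-Nets under the $L^\infty$ norm and in Sobolev norms. Our analysis extends from LN to RMSNorm, and from standard MLPs to position-wise feed-forward networks, the core building blocks used in RNNs and Transformers. Finally, we provide empirical experiments to explore other potential of PLN-Nets.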

Original Content

arXiv:2505.13142v2 Announce Type: replace Abstract: This paper studies the approximation capabilities of neural networks that combine layer normalization (LN) with linear layers. We prove that networks consisting of two linear layers with parallel layer normalizations (PLNs) inserted between them (referred to as PLN-Nets) achieve universal approximation, whereas architectures that use only standard LN exhibit strictly limited expressive power. We further analyze approximation rates of shallow and deep PLN-Nets under the $L^\infty$ norm as well as in Sobolev norms. Our analysis extends beyond LN to RMSNorm, and from standard MLPs to position-wise feed-forward networks, the core building blocks used in RNNs and Transformers. Finally, we provide empirical experiments to explore other possible potentials of PLN-Nets.
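
The architecture described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not define PLN, so this sketch assumes that "parallel layer normalization" means splitting the hidden vector into equal-width groups and layer-normalizing each group independently. The names `layer_norm`, `parallel_layer_norm`, `pln_net`, and the parameter `group_width` are hypothetical and do not come from the paper. Learnable gain and bias terms are omitted for brevity.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize the last axis to zero mean and (approximately) unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def parallel_layer_norm(h, group_width, eps=1e-5):
    """ASSUMPTION: PLN = independent layer norm on each group of hidden units.

    h has shape (batch, hidden); hidden must be divisible by group_width.
    """
    batch, hidden = h.shape
    if hidden % group_width != 0:
        raise ValueError("hidden size must be divisible by group_width")
    groups = h.reshape(batch, hidden // group_width, group_width)
    return layer_norm(groups, eps).reshape(batch, hidden)


def pln_net(x, w1, b1, w2, b2, group_width):
    """Linear layer -> parallel layer norm -> linear layer (no other activation)."""
    h = x @ w1 + b1
    h = parallel_layer_norm(h, group_width)
    return h @ w2 + b2


# Illustrative shapes only: 3 inputs, 8 hidden units in 4 groups of 2, 1 output.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
w1, b1 = rng.normal(size=(3, 8)), rng.normal(size=8)
w2, b2 = rng.normal(size=(8, 1)), rng.normal(size=1)
y = pln_net(x, w1, b1, w2, b2, group_width=2)
print(y.shape)
```

In this sketch the only nonlinearity between the two linear layers is the normalization itself. Setting `group_width` equal to the hidden size reduces `parallel_layer_norm` to a single standard LN over the whole hidden vector, which is the case the abstract contrasts with PLN-Nets.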