arxiv_cs_lg April 24, 2026

Local Diffusion Models and Phases of Data Distributions

Translated: 2026/4/24 20:08:17

diffusion-models statistical-physics phase-transitions generative-ai neural-networks

Original Content

arXiv:2508.06614v2
Announce Type: replace

Abstract: As a class of generative artificial intelligence frameworks inspired by statistical physics, diffusion models have shown extraordinary performance in synthesizing complicated data distributions through a denoising process gradually guided by score functions. Real-life data, like images, is often spatially structured in low-dimensional spaces. However, ordinary diffusion models ignore this local structure and learn spatially global score functions, which are often computationally expensive. In this work, motivated by recent advances in non-equilibrium statistical physics, we develop a generic framework for defining phases of data distributions and use it to analyze the locality requirements of denoisers in diffusion models. We define two distributions as belonging to the same data distribution phase if they can be mutually connected via spatially local operations such as local denoisers, along the same evolution path as the diffusion. We demonstrate that the reverse denoising process consists of an early trivial phase and a late data phase, sandwiching a rapid phase transition where local denoisers must fail. We further demonstrate that the performance of local denoisers is closely tied to spatial Markovianity, which provides an operational criterion for diagnosing such phase transitions. We validate this criterion through numerical experiments on real-world datasets. Our work suggests guidance for simpler and more efficient architectures of diffusion models: far from the phase transition point, we can use small local neural networks to compute the score function; global neural networks are only necessary around the narrow time interval of phase transitions. This result also opens up new directions for studying phases of data distributions, the broader science of generative artificial intelligence, and guiding the design of neural networks inspired by physics concepts.
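To make the spatial Markovianity criterion concrete, here is a minimal sketch of how such a diagnostic could be computed on a toy model. It is an illustration under stated assumptions, not the authors' experiment. The abstract does not say how Markovianity is measured, so the sketch assumes the conditional mutual information I(A:C|B) between two regions A and C separated by a buffer B, which vanishes exactly when the distribution is Markov across B. The toy data (a 1D Gaussian AR(1) chain), the variance-preserving noising rule, and all function names (`chain_covariance`, `noised_covariance`, `gaussian_cmi`, `cmi_along_noising`) are hypothetical choices made for this example.

```python
import numpy as np


def chain_covariance(n_sites, rho):
    """Covariance rho**|i-j| of a stationary Gaussian AR(1) chain (spatially Markov)."""
    idx = np.arange(n_sites)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def noised_covariance(cov, signal_fraction):
    """Assumed variance-preserving noising: x_t = sqrt(s) * x_0 + sqrt(1 - s) * noise."""
    return signal_fraction * cov + (1.0 - signal_fraction) * np.eye(cov.shape[0])


def gaussian_cmi(cov, a, b, c):
    """I(A:C|B) in nats for a zero-mean Gaussian with covariance `cov`.

    Uses I(A:C|B) = H(AB) + H(BC) - H(B) - H(ABC); the 2*pi*e constants cancel,
    leaving only log-determinants of the marginal covariance blocks.
    """
    def logdet(sites):
        sub = cov[np.ix_(sites, sites)]
        return np.linalg.slogdet(sub)[1]

    return 0.5 * (logdet(a + b) + logdet(b + c) - logdet(b) - logdet(a + b + c))


def cmi_along_noising(n_sites=12, rho=0.9,
                      signal_fractions=(1.0, 0.75, 0.5, 0.25, 0.0)):
    """Evaluate I(A:C|B) at several noise levels for a fixed A | B | C split."""
    a = list(range(0, 5))          # left region
    b = [5, 6]                     # buffer separating A from C
    c = list(range(7, n_sites))    # right region
    cov0 = chain_covariance(n_sites, rho)
    return {s: gaussian_cmi(noised_covariance(cov0, s), a, b, c)
            for s in signal_fractions}


if __name__ == "__main__":
    for s, cmi in cmi_along_noising().items():
        print(f"signal fraction {s:.2f}: I(A:C|B) = {cmi:.3e} nats")
```

In this toy, the clean chain (signal fraction 1) and pure noise (signal fraction 0) are both Markov across the buffer, so the conditional mutual information vanishes up to rounding, while at intermediate noise levels it is strictly positive because a noisy observation of a Markov chain is no longer Markov. A Gaussian chain has no sharp phase transition, so the sketch shows only how the diagnostic quantity is evaluated, not the transition the abstract describes.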