arxiv_cs_ai April 24, 2026

Generalization Properties of Score-matching Diffusion Models for Intrinsically Low-dimensional Data

Translated: 2026/4/24 20:34:53

diffusion-models, optimal-transport, statistical-geometry, wasserstein-distance, machine-learning

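Japanese Translation (rendered in English)

arXiv:2603.03700v2 Announce Type: replace-cross Abstract: The empirical success of score-based diffusion models is remarkable, yet their statistical guarantees remain underdeveloped. Existing analyses often give pessimistic convergence rates that do not reflect the intrinsically low-dimensional structure found in real data such as natural images. In this paper, we study the statistical convergence of score-based diffusion models for learning an unknown distribution $\mu$ from a finite number of samples. Under mild regularity assumptions on the forward diffusion process and the data distribution, we derive finite-sample error bounds for the learned generative distribution, measured in the Wasserstein-$p$ distance ($\mathbb{W}_p$). Unlike prior results, our guarantees hold for all $p \ge 1$ and need only a finite-moment assumption on $\mu$, with no compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d. samples from a distribution $\mu$ with finite $q$-th moment, together with appropriately chosen network architectures, hyperparameters, and discretization schemes, we show that the expected Wasserstein-$p$ error between the learned distribution $\hat{\mu}$ and $\mu$ scales as $\mathbb{E}\, \mathbb{W}_p(\hat{\mu},\mu) = \widetilde{O}\!\left(n^{-1 / d^\ast_{p,q}(\mu)}\right)$, where $d^\ast_{p,q}(\mu)$ is the $(p,q)$-Wasserstein dimension of $\mu$. Our results show that diffusion models adapt naturally to the intrinsic geometry of the data and mitigate the curse of dimensionality, because the convergence rate depends on $d^\ast_{p,q}(\mu)$ rather than on the ambient dimension. Furthermore, our theory conceptually connects the analysis of diffusion models with the analysis of GANs and with the sharp minimax rates established in optimal transport. The proposed $(p,q)$-Wasserstein dimension extends the classical notion of Wasserstein dimension to distributions with unbounded support, which may be of independent theoretical interest.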

Original Content

arXiv:2603.03700v2 Announce Type: replace-cross Abstract: Despite the remarkable empirical success of score-based diffusion models, their statistical guarantees remain underdeveloped. Existing analyses often provide pessimistic convergence rates that do not reflect the intrinsic low-dimensional structure common in real data, such as that arising in natural images. In this work, we study the statistical convergence of score-based diffusion models for learning an unknown distribution $\mu$ from finitely many samples. Under mild regularity conditions on the forward diffusion process and the data distribution, we derive finite-sample error bounds on the learned generative distribution, measured in the Wasserstein-$p$ distance. Unlike prior results, our guarantees hold for all $p \ge 1$ and require only a finite-moment assumption on $\mu$, without compact-support, manifold, or smooth-density conditions. Specifically, given $n$ i.i.d. samples from $\mu$ with finite $q$-th moment and appropriately chosen network architectures, hyperparameters, and discretization schemes, we show that the expected Wasserstein-$p$ error between the learned distribution $\hat{\mu}$ and $\mu$ scales as $\mathbb{E}\, \mathbb{W}_p(\hat{\mu},\mu) = \widetilde{O}\!\left(n^{-1 / d^\ast_{p,q}(\mu)}\right)$, where $d^\ast_{p,q}(\mu)$ is the $(p,q)$-Wasserstein dimension of $\mu$. Our results demonstrate that diffusion models naturally adapt to the intrinsic geometry of data and mitigate the curse of dimensionality, since the convergence rate depends on $d^\ast_{p,q}(\mu)$ rather than the ambient dimension. Moreover, our theory conceptually bridges the analysis of diffusion models with that of GANs and the sharp minimax rates established in optimal transport. The proposed $(p,q)$-Wasserstein dimension also extends the notion of classical Wasserstein dimension to distributions with unbounded support, which may be of independent theoretical interest.
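
To make the role of the exponent in the stated rate concrete, the following is a minimal Python sketch of the arithmetic behind $n^{-1/d}$. It is an illustration under stated assumptions, not the paper's analysis: it drops the constants and logarithmic factors hidden in $\widetilde{O}$, and the function names (`rate`, `log10_samples_needed`) and the two example dimensions (10 as a stand-in for an intrinsic dimension, 3072 as a stand-in for the ambient dimension of a 32x32 RGB image) are hypothetical choices that do not come from the abstract.

```python
import math


def rate(n: int, dim: float) -> float:
    """Idealized error level n**(-1/dim), with constants and log factors dropped."""
    return n ** (-1.0 / dim)


def log10_samples_needed(eps: float, dim: float) -> float:
    """log10 of the sample size n at which n**(-1/dim) equals eps.

    Solving n**(-1/dim) = eps gives n = eps**(-dim); the result is returned
    on a log10 scale because n overflows a float for large dim.
    """
    return dim * math.log10(1.0 / eps)


if __name__ == "__main__":
    eps = 0.1  # hypothetical target error level
    # Hypothetical dimensions: 10 (intrinsic) versus 3072 (ambient, 32*32*3).
    for dim in (10, 3072):
        print(f"dim={dim}: log10(n) needed for eps={eps} is "
              f"{log10_samples_needed(eps, dim):.1f}")
    n = 10**6
    for dim in (10, 3072):
        print(f"dim={dim}: idealized rate at n={n} is {rate(n, dim):.4f}")
```

Under these simplifications the required sample size grows as $\varepsilon^{-d}$, so the practical gap between a rate governed by $d^\ast_{p,q}(\mu)$ and one governed by the ambient dimension is exponential in the difference between the two.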