arxiv_cs_lg April 24, 2026

Latent Stochastic Interpolants

latent-stochastic-interpolants, generative-modeling, variational-inference, diffusion-models, image-generation

Original Content

arXiv:2506.02276v2
Announce Type: replace

Abstract: Stochastic Interpolants (SI) is a powerful framework for generative modeling, capable of flexibly transforming between two probability distributions. However, its use in jointly optimized latent variable models remains unexplored as it requires direct access to the samples from the two distributions. This work presents Latent Stochastic Interpolants (LSI) enabling joint learning in a latent space with end-to-end optimized encoder, decoder and latent SI models. We achieve this by developing a principled Evidence Lower Bound (ELBO) objective derived directly in continuous time. The joint optimization allows LSI to learn effective latent representations along with a generative process that transforms an arbitrary prior distribution into the encoder-defined aggregated posterior. LSI sidesteps the simple priors of the normal diffusion models and mitigates the computational demands of applying SI directly in high-dimensional observation spaces, while preserving the generative flexibility of the SI framework. We demonstrate the efficacy of LSI through comprehensive experiments on the standard large scale ImageNet generation benchmark.
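
To make the idea of "transforming between two probability distributions" concrete, here is a minimal sketch of a generic stochastic interpolant between a prior sample and a target sample. It is not the LSI method or its ELBO objective, which the abstract does not spell out. The linear coefficients, the noise scale `gamma(t) = sqrt(2 t (1 - t))`, and the helper names `gamma` and `interpolate` are illustrative assumptions, and the Gaussian draws standing in for prior samples and encoder latents are hypothetical.

```python
import math
import random


def gamma(t: float) -> float:
    """Noise scale that vanishes at t = 0 and t = 1 (an assumed, illustrative choice)."""
    return math.sqrt(2.0 * t * (1.0 - t))


def interpolate(x0, x1, t, rng):
    """Return x_t = (1 - t) * x0 + t * x1 + gamma(t) * z, with z ~ N(0, I)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    g = gamma(t)
    return [(1.0 - t) * a + t * b + g * rng.gauss(0.0, 1.0) for a, b in zip(x0, x1)]


rng = random.Random(0)
# Hypothetical stand-ins: x0 plays the role of a prior sample,
# x1 the role of a latent code produced by an encoder.
x0 = [rng.gauss(0.0, 1.0) for _ in range(4)]
x1 = [rng.gauss(3.0, 0.5) for _ in range(4)]

for t in (0.0, 0.5, 1.0):
    print(t, interpolate(x0, x1, t, rng))
```

At `t = 0` the interpolant returns the prior sample and at `t = 1` it returns the target sample, with noise injected only in between. In a latent setting such as the one the abstract describes, `x1` would presumably come from the encoder rather than from the observation space, which is where the computational saving would arise.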