arxiv_cs_lg April 24, 2026

Tokenised Flow Matching for Hierarchical Simulation Based Inference

Translated: 2026/4/24 20:00:15

tokenized-flow-matching, simulation-based-inference, hierarchical-models, neural-surrogates, arxiv-2604

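Japanese Translation (rendered in English)

arXiv:2604.20723v1 Announce Type: new Abstract: In simulation-based inference (SBI), the cost of simulator evaluations is a major bottleneck in practice. In hierarchical settings with shared global parameters and exchangeable site-level parameters and observations, this structure can be exploited to improve simulation efficiency. Existing hierarchical SBI approaches factorise the posterior but still run multiple per-site simulations for each training sample; we instead explore likelihood factorisation (LF), which allows training from single-site simulations. In LF sampling, we learn a per-site neural surrogate of the simulator and combine its outputs into synthetic multi-site observations, amortising the inference cost for the full hierarchical posterior. Building on this, we propose Tokenised Flow Matching for Posterior Estimation (TFMPE), a tokenised flow matching approach that uses likelihood factorisation to support function-valued observations. To enable systematic evaluation, we introduce a benchmark for hierarchical SBI. We validate TFMPE on this benchmark and on realistic infectious disease and computational fluid dynamics models, obtaining well-calibrated posteriors while reducing computational cost.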

Original Content

arXiv:2604.20723v1 Announce Type: new Abstract: The cost of simulator evaluations is a key practical bottleneck for Simulation Based Inference (SBI). In hierarchical settings with shared global parameters and exchangeable site-level parameters and observations, this structure can be exploited to improve simulation efficiency. Existing hierarchical SBI approaches factorise the posterior yet still simulate across multiple sites per training sample; we instead explore likelihood factorisation (LF) to train from single-site simulations. In LF sampling, we learn a per-site neural surrogate of the simulator and then assemble synthetic multi-site observations to amortise inference for the full hierarchical posterior. Building on this, we propose Tokenised Flow Matching for Posterior Estimation (TFMPE), a tokenised flow matching approach that supports function-valued observations through likelihood factorisation. To enable systematic evaluation, we introduce a benchmark for hierarchical SBI. We validate TFMPE on this benchmark and on realistic infectious disease and computational fluid dynamics models, finding well-calibrated posteriors while reducing computational cost.
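
The LF sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy Gaussian hierarchical model, the function names (`simulate_site`, `sample_prior`, `surrogate_sample`), and the use of a least-squares linear-Gaussian fit in place of a neural surrogate are all assumptions made for illustration. The sketch shows only the data flow: simulate single sites, fit a per-site surrogate, then assemble synthetic multi-site observations without further simulator calls.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_site(theta, phi, rng):
    """Hypothetical stand-in for an expensive single-site simulator."""
    return phi + 0.5 * theta + 0.1 * rng.standard_normal(np.shape(phi))

def sample_prior(n_sets, n_sites, rng):
    """Toy hierarchical prior: global theta, exchangeable site-level phi."""
    theta = rng.standard_normal(n_sets)                             # (n_sets,)
    phi = theta[:, None] + rng.standard_normal((n_sets, n_sites))   # (n_sets, n_sites)
    return theta, phi

# Step 1: run the simulator on single sites only (one site per training sample).
n_train = 2000
theta_tr, phi_tr = sample_prior(n_train, 1, rng)
phi_tr = phi_tr[:, 0]
x_tr = simulate_site(theta_tr, phi_tr, rng)

# Step 2: fit a per-site surrogate of p(x | theta, phi).
# Assumption: a linear-Gaussian least-squares fit replaces the neural surrogate.
features = np.column_stack([theta_tr, phi_tr, np.ones(n_train)])
coef, *_ = np.linalg.lstsq(features, x_tr, rcond=None)
resid_std = np.std(x_tr - features @ coef)

def surrogate_sample(theta, phi, rng):
    """Draw x from the fitted surrogate; broadcasts theta over sites."""
    mean = coef[0] * theta + coef[1] * phi + coef[2]
    return mean + resid_std * rng.standard_normal(np.shape(phi))

# Step 3: assemble synthetic multi-site observations from the surrogate alone.
n_sets, n_sites = 500, 8
theta_syn, phi_syn = sample_prior(n_sets, n_sites, rng)
x_syn = surrogate_sample(theta_syn[:, None], phi_syn, rng)  # (n_sets, n_sites)
```

In this toy setup the simulator is called 2000 times, whereas simulating 500 datasets of 8 sites directly would take 4000 calls. The tuples (`theta_syn`, `phi_syn`, `x_syn`) would then serve as training data for an amortised posterior estimator, which is not shown here.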