arxiv_cs_lg 2026年4月24日

暖開始されたマルコフ連鎖モンテカルロ (MCMC) フィニチューニングを用いた二次配置問題 (QAP) の学習

Learning to Solve the Quadratic Assignment Problem with Warm-Started MCMC Finetuning

Translated: 2026/4/24 19:56:17

quadratic-assignment-problemwarm-started-mcmcpermutation-learningenergy-based-modelsneural-architecture-search

Japanese Translation

二次配置問題（QAP）は、非決定的多項式時間問題の NP-hard の典型的な例であり、従来型のヒューリスティック解法だけでなく、最新の学習ベースのソルバーに対しても大きな挑戦を呈しています。既存の QAP ソルバーは、構造的に多様な実世界のインスタンスにおいて一貫した競争力のあるパフォーマンスを達成できていません。この性能のギャップを埋めるために、我々は効率的な暖開始された MCMC フィニチューニング手順を持つ革新的な置換学習フレームワークである PLMA を提案します。PLMA は、短マルコフ連鎖を用いて、以前探査された有望な領域への適応をアンカー化し、デプロイタイムのパフォーマンスを向上させることを特徴とします。さらに、置換空間への MCMC による迅速な探査を可能にするために、加算エネルギーベースモデル（EBM）を設計し、$O(1)$ 時間の 2-スワップメトロポリス・ハスティングのサンプリングステップを有効にします。また、EBM をパラメータ化するために使用されるニューラルネットワークは、QAP の施設と場所間の相互作用をモデル化するために、拡張性と柔軟性が高いクロスグラフ注意機構を統合しています。大規模な実験结果表明，PLMA は Various benchmarks において状態の最前線の基盤より一貫して優れています。特に、PLMA は QAPLIB において平均最適性ギャップをほぼゼロに達成し、有名な難易度の高い Taixxeyy インスタンスにおいて驚異的に優れる頑健性を示し、さらに帯域幅最小化において有効な QAP ソルバーとして機能します。

Original Content

arXiv:2604.20109v1 Announce Type: new Abstract: The quadratic assignment problem (QAP) is a fundamental NP-hard task that poses significant challenges for both traditional heuristics and modern learning-based solvers. Existing QAP solvers still struggle to achieve consistently competitive performance across structurally diverse real-world instances. To bridge this performance gap, we propose PLMA, an innovative permutation learning framework. PLMA features an efficient warm-started MCMC finetuning procedure to enhance deployment-time performance, leveraging short Markov chains to anchor the adaptation to the promising regions previously explored. For rapid exploration via MCMC over the permutation space, we design an additive energy-based model (EBM) that enables an $O(1)$-time 2-swap Metropolis-Hastings sampling step. Moreover, the neural network used to parameterize the EBM incorporates a scalable and flexible cross-graph attention mechanism to model interactions between facilities and locations in the QAP. Extensive experiments demonstrate that PLMA consistently outperforms state-of-the-art baselines across various benchmarks. In particular, PLMA achieves a near-zero average optimality gap on QAPLIB, exhibits remarkably superior robustness on the notoriously difficult Taixxeyy instances, and also serves as an effective QAP solver in bandwidth minimization.