arxiv_cs_ai April 24, 2026

A Systematic Review and Taxonomy of Reinforcement Learning-Model Predictive Control Integration for Linear Systems

Translated: 2026/4/24 20:22:41

reinforcement-learning, model-predictive-control, systematic-review, linear-systems, control-theory

Original Content

arXiv:2604.21030v1 Announce Type: cross

Abstract: The integration of Model Predictive Control (MPC) and Reinforcement Learning (RL) has emerged as a promising paradigm for constrained decision-making and adaptive control. MPC offers structured optimization, explicit constraint handling, and established stability tools, whereas RL provides data-driven adaptation and performance improvement in the presence of uncertainty and model mismatch. Despite the rapid growth of research on RL-MPC integration, the literature remains fragmented, particularly for control architectures built on linear or linearized predictive models. This paper presents a comprehensive Systematic Literature Review (SLR) of RL-MPC integrations for linear and linearized systems, covering peer-reviewed and formally indexed studies published until 2025. The reviewed studies are organized through a multi-dimensional taxonomy covering RL functional roles, RL algorithm classes, MPC formulations, cost-function structures, and application domains. In addition, a cross-dimensional synthesis is conducted to identify recurring design patterns and reported associations among these dimensions within the reviewed corpus. The review highlights methodological trends, commonly adopted integration strategies, and recurring practical challenges, including computational burden, sample efficiency, robustness, and closed-loop guarantees. The resulting synthesis provides a structured reference for researchers and practitioners seeking to design or analyze RL-MPC architectures based on linear or linearized predictive control formulations.