arxiv_cs_lg 2026年4月24日

ファーストオーダー二段階最小化最大化最適化アルゴリズムの安定性と一般化性能に関する考察

On the Stability and Generalization of First-order Bilevel Minimax Optimization

Translated: 2026/4/24 19:56:26

bilevel-optimizationminimaxgeneralizationgradient-descentmachine-learning

Japanese Translation

arXiv:2604.20115v1 発表タイプ：新規要約：二段階最適化および二段階最小化最大化最適化は、ハイパーパラメータ最適化や強化学習の多岐にわたる機械学習タスクを統一的枠組みとして近年台頭しています。既存の文獻は実効性と収束保証に焦点を当てており、これらのアルゴリズムの一般化性能に関する理解においては決定的な理論的ギャップが存在しています。このギャップを埋めるために、下段階に最小化最大化問題を持つファーストオーダー勾配ベースの二段階最小化最大化ソルバーに対する、最初の系統的な一般化分析を提供します。具体的には、アルゴリズムの安定性論を借用することで、単一タイムスケールの確率勾配昇降法（SGDA）および二重タイムスケールの SGDA の 2 つの変形を含めた、3 つの代表的なアルゴリズムの精細な一般化限界を導出します。我々の結果は、アルゴリズム的安定性、一般化ギャップ、および実用的な設定間の明確なトレードオフを明らかにしました。さらに、広範な実用的評価は、二段階最小化最大化構造を持つ実世界の最適化タスクにおける我々の理論的情報を実証的に裏付けました。

Original Content

arXiv:2604.20115v1 Announce Type: new Abstract: Bilevel optimization and bilevel minimax optimization have recently emerged as unifying frameworks for a range of machine-learning tasks, including hyperparameter optimization and reinforcement learning. The existing literature focuses on empirical efficiency and convergence guarantees, leaving a critical theoretical gap in understanding how well these algorithms generalize. To bridge this gap, we provide the first systematic generalization analysis for first-order gradient-based bilevel minimax solvers with lower-level minimax problems. Specifically, by leveraging algorithmic stability arguments, we derive fine-grained generalization bounds for three representative algorithms, including single-timescale stochastic gradient descent-ascent, and two variants of two-timescale stochastic gradient descent-ascent. Our results reveal a precise trade-off among algorithmic stability, generalization gaps, and practical settings. Furthermore, extensive empirical evaluations corroborate our theoretical insights on realistic optimization tasks with bilevel minimax structures.