arxiv_cs_lg February 10, 2026


Online Bayesian Imbalanced Learning with Bregman-Calibrated Deep Networks


online-ml, bayesian-inference, deep-learning, class-imbalance, regret-bounds


Original Content

arXiv:2602.08128v1 Announce Type: new

Abstract: Class imbalance remains a fundamental challenge in machine learning, where standard classifiers exhibit severe performance degradation in minority classes. Although existing approaches address imbalance through resampling or cost-sensitive learning during training, they require retraining or access to labeled target data when class distributions shift at deployment time, a common occurrence in real-world applications such as fraud detection, medical diagnosis, and anomaly detection. We present Online Bayesian Imbalanced Learning (OBIL), a principled framework that decouples likelihood-ratio estimation from class-prior assumptions, enabling real-time adaptation to distribution shifts without model retraining. Our approach builds on the established connection between Bregman divergences and proper scoring rules to show that deep networks trained with such losses produce posterior probability estimates from which prior-invariant likelihood ratios can be extracted. We prove that these likelihood-ratio estimates remain valid under arbitrary changes in class priors and cost structures, requiring only a threshold adjustment for optimal Bayes decisions. We derive finite-sample regret bounds demonstrating that OBIL achieves $O(\sqrt{T \log T})$ regret against an oracle with perfect prior knowledge. Extensive experiments on benchmark datasets and medical diagnosis benchmarks under simulated deployment shifts demonstrate that OBIL maintains robust performance under severe distribution shifts, outperforming state-of-the-art methods in F1 Score when test distributions deviate significantly from the training conditions.
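The threshold-adjustment idea in the abstract can be illustrated with standard binary Bayes decision theory. The sketch below is not the authors' implementation; it assumes a calibrated posterior $p(y=1 \mid x)$ from a model trained under a known class prior, and a deployment prior that is already known (the abstract does not say how OBIL estimates it online). The function names `likelihood_ratio`, `bayes_threshold`, and `decide`, and the cost parameters, are hypothetical. Dividing the posterior odds by the training prior odds gives a likelihood ratio that does not depend on the prior, and the decision rule then predicts the positive class when that ratio exceeds $c_{FP}(1-\pi')/(c_{FN}\,\pi')$ for deployment prior $\pi'$.

```python
def likelihood_ratio(posterior: float, train_prior: float, eps: float = 1e-12) -> float:
    """Recover p(x|y=1) / p(x|y=0) from a calibrated posterior p(y=1|x).

    Assumption: the posterior was produced under class prior `train_prior`.
    """
    # Clip to avoid division by zero at the extremes.
    p = min(max(posterior, eps), 1.0 - eps)
    posterior_odds = p / (1.0 - p)
    prior_odds = train_prior / (1.0 - train_prior)
    # Bayes' rule in odds form: posterior odds = likelihood ratio * prior odds.
    return posterior_odds / prior_odds


def bayes_threshold(deploy_prior: float, cost_fp: float = 1.0, cost_fn: float = 1.0) -> float:
    """Likelihood-ratio threshold minimizing expected cost under `deploy_prior`."""
    return (cost_fp * (1.0 - deploy_prior)) / (cost_fn * deploy_prior)


def decide(posterior: float, train_prior: float, deploy_prior: float,
           cost_fp: float = 1.0, cost_fn: float = 1.0) -> int:
    """Return 1 (positive) or 0 (negative) without retraining the model."""
    lr = likelihood_ratio(posterior, train_prior)
    return int(lr > bayes_threshold(deploy_prior, cost_fp, cost_fn))


if __name__ == "__main__":
    # Model trained with 1% positives outputs a posterior of 0.2 for some input.
    print(decide(0.2, train_prior=0.01, deploy_prior=0.01))  # same prior as training
    print(decide(0.2, train_prior=0.01, deploy_prior=0.20))  # positives more common at deployment
```

In this sketch only the threshold changes when the deployment prior or the costs change; the network output and the likelihood ratio derived from it are reused as-is, which is the decoupling the abstract describes.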