arxiv_cs_lg April 24, 2026

Too Sharp, Too Sure: When Calibration Follows Curvature

neural-networks, calibration, curvature, machine-learning, optimization

Original Content

arXiv:2604.20614v1 Announce Type: new

Abstract: Modern neural networks can achieve high accuracy while remaining poorly calibrated, producing confidence estimates that do not match empirical correctness. Yet calibration is often treated as a post-hoc attribute. We take a different perspective: we study calibration as a training-time phenomenon on small vision tasks, and ask whether calibrated solutions can be obtained reliably by intervening on the training procedure. We identify a tight coupling between calibration, curvature, and margins during training of deep networks under multiple gradient-based methods. Empirically, Expected Calibration Error (ECE) closely tracks curvature-based sharpness throughout optimization. Mathematically, we show that both ECE and Gauss-Newton curvature are controlled, up to problem-specific constants, by the same margin-dependent exponential tail functional along the trajectory. Guided by this mechanism, we introduce a margin-aware training objective that explicitly targets robust-margin tails and local smoothness, yielding improved out-of-sample calibration across optimizers without sacrificing accuracy.
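
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common equal-width-bin, top-label definition of ECE, which the abstract does not spell out. The function `exp_margin_tail` is a hypothetical stand-in for a "margin-dependent exponential tail": it averages exp(-margin) over samples, where the margin is the true-class logit minus the largest other logit. The paper's actual functional and training objective may differ, and all function names and the default of 15 bins are illustrative choices.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=15):
    """Top-label ECE with equal-width confidence bins (a common convention,
    assumed here; the abstract does not fix the binning scheme)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    conf = probs.max(axis=1)                 # confidence of the predicted class
    pred = probs.argmax(axis=1)              # predicted class
    correct = (pred == labels).astype(float)

    # Interior bin edges; sample i falls in bin b when edge[b] < conf <= edge[b+1].
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(conf, edges[1:-1], right=True)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Weight the |accuracy - confidence| gap by the fraction of samples in the bin.
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)


def exp_margin_tail(logits, labels):
    """Hypothetical exponential margin tail: mean of exp(-margin), where
    margin = true-class logit minus the largest competing logit."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    true_logit = logits[idx, labels]
    others = logits.copy()
    others[idx, labels] = -np.inf            # exclude the true class from the max
    margin = true_logit - others.max(axis=1)
    return float(np.mean(np.exp(-margin)))
```

Under these assumptions, a model that is always fully confident but right only half the time has an ECE of 0.5, and larger margins shrink the tail value toward zero. Tracking both quantities at each checkpoint is one simple way to see the kind of coupling the abstract describes.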