arxiv_cs_lg February 10, 2026

Fair Decisions from Calibrated Scores: Achieving Optimal Classification While Satisfying Sufficiency

binary-classification, fairness, group-calibration, statistical-parity, machine-learning

Original Content

arXiv:2602.07285v1
Announce Type: new

Abstract: Binary classification based on predicted probabilities (scores) is a fundamental task in supervised machine learning. While thresholding scores is Bayes-optimal in the unconstrained setting, using a single threshold generally violates statistical group fairness constraints. Under independence (statistical parity) and separation (equalized odds), such thresholding suffices when the scores already satisfy the corresponding criterion. However, this does not extend to sufficiency: even perfectly group-calibrated scores (including true class probabilities) violate predictive parity after thresholding. In this work, we present an exact solution for optimal binary (randomized) classification under sufficiency, assuming finite sets of group-calibrated scores. We provide a geometric characterization of the feasible pairs of positive predictive value (PPV) and false omission rate (FOR) achievable by such classifiers, and use it to derive a simple post-processing algorithm that attains the optimal classifier using only group-calibrated scores and group membership. Finally, since sufficiency and separation are generally incompatible, we identify the classifier that minimizes deviation from separation subject to sufficiency, and show that it can also be obtained by our algorithm, often achieving performance comparable to the optimum.
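
The claim that thresholding group-calibrated scores can break predictive parity can be checked on a toy example. The sketch below is illustrative only and is not the paper's algorithm: the two groups, their score values, the probability masses, and the helper name `ppv_for` are all assumptions made up for this example. It assumes each group has a finite set of perfectly calibrated scores, so that P(Y=1 | score = s, group) = s, and applies one shared threshold of 1/2.

```python
from fractions import Fraction as F


def ppv_for(scores, masses, threshold):
    """Return (PPV, FOR) for the rule 'predict 1 iff score >= threshold'.

    Assumes calibration within the group: P(Y=1 | score = s) = s.
    PPV = P(Y=1 | prediction = 1), FOR = P(Y=1 | prediction = 0).
    Returns None for a rate whose conditioning event has zero mass.
    """
    pairs = list(zip(scores, masses))
    predicted_pos = [(s, m) for s, m in pairs if s >= threshold]
    predicted_neg = [(s, m) for s, m in pairs if s < threshold]

    def positive_rate(subset):
        total = sum(m for _, m in subset)
        if total == 0:
            return None
        # Under calibration, the share of true positives is the
        # mass-weighted average of the scores in the subset.
        return sum(s * m for s, m in subset) / total

    return positive_rate(predicted_pos), positive_rate(predicted_neg)


# Hypothetical groups: (score values, probability mass of each value).
groups = {
    "A": ([F(1, 5), F(4, 5)], [F(1, 2), F(1, 2)]),
    "B": ([F(2, 5), F(3, 5)], [F(1, 2), F(1, 2)]),
}

results = {g: ppv_for(s, m, F(1, 2)) for g, (s, m) in groups.items()}
for g, (ppv, fo) in results.items():
    print(g, "PPV =", float(ppv), "FOR =", float(fo))
# A PPV = 0.8 FOR = 0.2
# B PPV = 0.6 FOR = 0.4
```

Both groups are calibrated and share the same threshold, yet PPV and FOR differ between them, so the thresholded decisions do not satisfy sufficiency. Exact fractions are used here only to keep the arithmetic free of floating-point noise.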