arxiv_cs_lg February 10, 2026

Winner's Curse Drives False Promises in Data-Driven Decisions: A Case Study in Refugee Matching

data-driven-decision-making, winner's-curse, policy-evaluation, refugee-matching, counterfactual-outcomes

Original Content

arXiv:2602.08892v1 Announce Type: cross

Abstract: A major challenge in data-driven decision-making is accurate policy evaluation, i.e., guaranteeing that a learned decision-making policy achieves the promised benefits. A popular strategy is model-based policy evaluation, which estimates a model from data to infer counterfactual outcomes. This strategy is known to produce unwarrantedly optimistic estimates of the true benefit due to the winner's curse. We searched the recent literature on data-driven decision-making, identifying a sample of 55 papers published in Management Science in the past decade; all but two relied on this flawed methodology. Several common justifications are provided: (1) the estimated models are accurate, stable, and well-calibrated, (2) the historical data uses random treatment assignment, (3) the model family is well-specified, and (4) the evaluation methodology uses sample splitting. Unfortunately, we show that no combination of these justifications avoids the winner's curse. First, we provide a theoretical analysis demonstrating that the winner's curse can cause large, spurious reported benefits even when all these justifications hold. Second, we perform a simulation study based on the recent and consequential data-driven refugee matching problem. We construct a synthetic refugee matching environment (calibrated to closely match the real setting) but designed so that no assignment policy can improve expected employment compared to random assignment. Model-based methods report large, stable gains of around 60% even when the true effect is zero; these gains are on par with improvements of 22-75% reported in the literature. Our results provide strong evidence against model-based evaluation.
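The mechanism described in the abstract can be illustrated with a small toy simulation. The sketch below is not the paper's simulation study or its calibrated refugee matching environment. It is a minimal example under assumptions chosen here for illustration: a tabular model of per-(type, location) employment rates, a constant true employment probability `p_true`, no capacity constraints, and hypothetical names such as `toy_winners_curse`. Historical assignment is random and the tabular model is well-specified, yet scoring the learned policy with the same fitted model used to choose it reports a positive gain, while the true gain is zero by construction.

```python
import numpy as np


def toy_winners_curse(n_train=2000, n_test=500, n_types=10, n_locs=8,
                      p_true=0.3, seed=0):
    """Toy check of model-based (plug-in) policy evaluation under a null effect.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Historical data: random assignment; employment never depends on location.
    types = rng.integers(n_types, size=n_train)
    locs = rng.integers(n_locs, size=n_train)
    y = (rng.random(n_train) < p_true).astype(float)

    # "Model": empirical employment rate for each (type, location) cell.
    sums = np.zeros((n_types, n_locs))
    counts = np.zeros((n_types, n_locs))
    np.add.at(sums, (types, locs), y)
    np.add.at(counts, (types, locs), 1.0)
    # Empty cells (unlikely at these sizes) fall back to the overall mean.
    p_hat = np.where(counts > 0, sums / np.maximum(counts, 1.0), y.mean())

    # Learned policy: send each type to the location with the highest estimate.
    # Simplification: capacity constraints are ignored.
    best_loc = p_hat.argmax(axis=1)

    # Held-out cohort, scored with the SAME fitted model that chose the policy.
    test_types = rng.integers(n_types, size=n_test)
    value_policy = p_hat[test_types, best_loc[test_types]].mean()
    value_random = p_hat[test_types].mean(axis=1).mean()

    return {
        # Relative gain that the model-based evaluation would report.
        "reported_gain": value_policy / value_random - 1.0,
        # True relative gain: every policy has employment rate p_true.
        "true_gain": p_true / p_true - 1.0,
    }


if __name__ == "__main__":
    result = toy_winners_curse()
    print(f"reported gain: {result['reported_gain']:.1%}")
    print(f"true gain:     {result['true_gain']:.1%}")
```

The reported gain is positive because the policy picks, for each type, the cell whose estimate happened to be highest, and the maximum of several noisy estimates is biased upward even when each estimate is unbiased. Holding out the cohort does not remove this bias in the sketch, since the held-out cohort is still scored with the estimates that drove the selection.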