arxiv_cs_ai February 10, 2026

CausalT5K: Diagnosing and Informing Refusal for Trustworthy Causal Reasoning of Skepticism, Sycophancy, Detection-Correction, and Rung Collapse

Translated: 2026/3/7 11:19:21

causal-reasoning, LLM-failure, trustworthy-machine-learning, cognitive-narrativ, diagnostic-benchmark

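Japanese Translation (rendered in English)

Failures of LLMs in causal reasoning, including sycophancy, rung collapse, and miscalibrated refusal, are widely documented, yet remediation is still at an early stage because no suitable evaluation benchmark exists. We developed CausalT5K, a diagnostic benchmark of more than 5,000 cases across 10 domains. It is designed to test three key capabilities: (1) detecting rung collapse, where a model answers an interventional question using only correlational evidence; (2) resisting sycophantic drift under adversarial pressure; and (3) generating a "Wise Refusal" that specifies the missing information when the evidence is underdetermined. Unlike synthetic benchmarks, CausalT5K embeds causal traps in realistic scenarios and decomposes performance into Utility (sensitivity) and Safety (specificity). CausalT5K provides research infrastructure that implements Pearl's Ladder of Causation. Our experiments reveal a Four-Quadrant Control Landscape and show that static audit policies universally fail, which demonstrates the value of CausalT5K for progress toward trustworthy reasoning systems. The benchmark is available at https://github.com/genglongling/CausalT5KBench.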

Original Content

arXiv:2602.08939v1
Announce Type: new

Abstract: LLM failures in causal reasoning, including sycophancy, rung collapse, and miscalibrated refusal, are well-documented, yet progress on remediation is slow because no benchmark enables systematic diagnosis. We introduce CausalT5K, a diagnostic benchmark of over 5,000 cases across 10 domains that tests three critical capabilities: (1) detecting rung collapse, where models answer interventional queries with associational evidence; (2) resisting sycophantic drift under adversarial pressure; and (3) generating Wise Refusals that specify missing information when evidence is underdetermined. Unlike synthetic benchmarks, CausalT5K embeds causal traps in realistic narratives and decomposes performance into Utility (sensitivity) and Safety (specificity), revealing failure modes invisible to aggregate accuracy. Developed through a rigorous human-machine collaborative pipeline involving 40 domain experts, iterative cross-validation cycles, and composite verification via rule-based, LLM, and human scoring, CausalT5K implements Pearl's Ladder of Causation as research infrastructure. Preliminary experiments reveal a Four-Quadrant Control Landscape where static audit policies universally fail, a finding that demonstrates CausalT5K's value for advancing trustworthy reasoning systems.

Repository: https://github.com/genglongling/CausalT5kBench
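
The abstract's split of performance into Utility (sensitivity) and Safety (specificity) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the CausalT5K repository: the `Case` record, its field names, the convention that a "positive" is a valid causal claim the model should endorse, and the toy data. The point it shows is how a model that endorses every claim can still reach a high aggregate accuracy while its Safety is zero.

```python
from dataclasses import dataclass


@dataclass
class Case:
    # Hypothetical record layout, not the benchmark's actual schema.
    claim_is_valid: bool   # gold label: should the causal claim be endorsed?
    model_endorsed: bool   # model verdict: did it endorse the claim?


def utility_and_safety(cases):
    """Return (utility, safety, accuracy) for a list of Case records.

    Assumed convention: utility = sensitivity = TP / (TP + FN),
    safety = specificity = TN / (TN + FP).
    """
    tp = sum(c.claim_is_valid and c.model_endorsed for c in cases)
    fn = sum(c.claim_is_valid and not c.model_endorsed for c in cases)
    tn = sum(not c.claim_is_valid and not c.model_endorsed for c in cases)
    fp = sum(not c.claim_is_valid and c.model_endorsed for c in cases)
    utility = tp / (tp + fn) if (tp + fn) else float("nan")
    safety = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / len(cases) if cases else float("nan")
    return utility, safety, accuracy


# Toy data (made up): 8 valid claims and 2 causal traps,
# scored for a model that endorses every claim it sees.
cases = [Case(True, True)] * 8 + [Case(False, True)] * 2
utility, safety, accuracy = utility_and_safety(cases)
print(f"utility={utility:.2f} safety={safety:.2f} accuracy={accuracy:.2f}")
```

On this toy set the always-endorse model scores perfect Utility and 80% accuracy but zero Safety, which is the kind of failure the abstract describes as invisible to aggregate accuracy.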