arxiv_cs_ai April 24, 2026

Large Language Models Outperform Humans in Fraud Detection and Resistance to Motivated Investor Pressure

large-language-models, fraud-detection, investment-advisory, human-in-the-loop, arxiv-preprint

Original Content

arXiv:2604.20652v2 Announce Type: replace

Abstract: Large language models trained on human feedback may suppress fraud warnings when investors arrive already persuaded of a fraudulent opportunity. We tested this in a preregistered experiment across seven leading LLMs and twelve investment scenarios covering legitimate, high-risk, and objectively fraudulent opportunities, combining 3,360 AI advisory conversations with a 1,201-participant human benchmark. Contrary to predictions, motivated investor framing did not suppress AI fraud warnings; if anything, it marginally increased them. Endorsement reversal occurred in fewer than 3 in 1,000 observations. Human advisors endorsed fraudulent investments at baseline rates of 13-14%, versus 0% across all LLMs, and suppressed warnings under pressure at two to four times the AI rate. AI systems currently provide more consistent fraud warnings than lay humans in an identical advisory role.
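To put the sample sizes in the abstract in perspective, the short sketch below works through the arithmetic. Only the counts (7 models, 12 scenarios, 3,360 conversations, "fewer than 3 in 1,000") come from the abstract. The even split across model-scenario pairs and the choice of the 3,360 conversations as the observation base are assumptions made for illustration, since the abstract does not describe how conversations were allocated or which observations the reversal rate refers to.

```python
# Figures quoted in the abstract.
N_MODELS = 7
N_SCENARIOS = 12
N_AI_CONVERSATIONS = 3360

# Assumption: conversations are spread evenly over (model, scenario) pairs.
n_cells = N_MODELS * N_SCENARIOS
per_cell, remainder = divmod(N_AI_CONVERSATIONS, n_cells)

# "Fewer than 3 in 1,000" expressed as a percentage bound.
reversal_bound_pct = 3 * 100 / 1000

# Assumption: the 3,360 AI conversations are the observations being counted.
max_reversals = 3 * N_AI_CONVERSATIONS / 1000

print(f"model-scenario pairs: {n_cells}")
print(f"conversations per pair (even split): {per_cell}, remainder {remainder}")
print(f"reversal rate bound: {reversal_bound_pct}%")
print(f"implied ceiling on reversals: {max_reversals}")
```

Under these assumptions the total divides evenly across model-scenario pairs, which is consistent with a balanced design, though the abstract itself does not say how framing conditions or repetitions were distributed within each pair.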