arxiv_cs_ai 2026/4/24

Alignment has a Fantasia Problem

Translated: 2026/4/24 20:18:43

alignment, ai-assistants, fantasia-interactions, behavioral-science, human-ai-collaboration

Original Content

arXiv:2604.21827v1 Announce Type: new

Abstract: Modern AI assistants are trained to follow instructions, implicitly assuming that users can clearly articulate their goals and the kind of assistance they need. Decades of behavioral research, however, show that people often engage with AI systems before their goals are fully formed. When AI systems treat prompts as complete expressions of intent, they can appear to be useful or convenient, but not necessarily aligned with the users' needs. We call these failures Fantasia interactions. We argue that Fantasia interactions demand a rethinking of alignment research: rather than treating users as rational oracles, AI should provide cognitive support by actively helping users form and refine their intent through time. This requires an interdisciplinary approach that bridges machine learning, interface design, and behavioral science. We synthesize insights from these fields to characterize the mechanisms and failures of Fantasia interactions. We then show why existing interventions are insufficient, and propose a research agenda for designing and evaluating AI systems that better help humans navigate uncertainty in their tasks.