arxiv_cs_gr April 24, 2026

Empowering NPC Dialogue with Environmental Context Using LLMs and Panoramic Images

Translated: 2026/4/24 19:53:59

llm, computer-vision, semantic-segmentation, npc, game-ai

Original Content

arXiv:2604.19192v2
Announce Type: replace

Abstract: We present an approach for enhancing non-playable characters (NPCs) in games by combining large language models (LLMs) with computer vision to provide contextual awareness of their surroundings. Conventional NPCs typically rely on pre-scripted dialogue and lack spatial understanding, which limits their responsiveness to player actions and reduces overall immersion. Our method addresses these limitations by capturing panoramic images of an NPC's environment and applying semantic segmentation to identify objects and their spatial positions. The extracted information is used to generate a structured JSON representation of the environment, combining object locations derived from segmentation with additional scene graph data within the NPC's bounding sphere, encoded as directional vectors. This representation is provided as input to the LLM, enabling NPCs to incorporate spatial knowledge into player interactions. As a result, NPCs can dynamically reference nearby objects, landmarks, and environmental features, leading to more believable and engaging gameplay. We describe the technical implementation of the system and evaluate it in two stages. First, an expert interview was conducted to gather feedback and identify areas for improvement. After integrating these refinements, a user study was performed, showing that participants preferred the context-aware NPCs over a non-context-aware baseline, confirming the effectiveness of the proposed approach.
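The pipeline in the abstract (segmented panorama, directional vectors, structured JSON, LLM input) can be illustrated with a minimal sketch. The abstract does not give the projection, the axis convention, the JSON schema, or the prompt wording, so everything below is an assumption made for illustration: an equirectangular panorama, a frame with +x forward, +y left, and +z up, and the hypothetical names `pixel_to_direction`, `build_context`, and `build_prompt`. The segmentation step is stubbed out as a list of labelled mask centroids.

```python
import json
import math


def pixel_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction vector.

    Assumed convention (not from the paper): +x is the NPC's forward axis,
    +y points to its left, +z points up, and the image centre looks along +x.
    """
    yaw = (0.5 - u / width) * 2.0 * math.pi   # left of centre gives positive yaw
    pitch = (0.5 - v / height) * math.pi      # above centre gives positive pitch
    return (
        math.cos(pitch) * math.cos(yaw),
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
    )


def build_context(segments, width, height, scene_objects=(), radius=10.0):
    """Build a JSON string describing the NPC's surroundings.

    segments: (label, u, v) centroids of segmentation masks (hypothetical input).
    scene_objects: (label, (dx, dy, dz)) offsets from the NPC, standing in for
        scene graph data; entries outside the bounding sphere are dropped.
    """
    objects = []
    for label, u, v in segments:
        direction = pixel_to_direction(u, v, width, height)
        objects.append({
            "label": label,
            "source": "segmentation",
            "direction": [round(c, 3) for c in direction],
        })
    for label, offset in scene_objects:
        distance = math.sqrt(sum(c * c for c in offset))
        if distance == 0.0 or distance > radius:
            continue  # skip the NPC's own position and anything outside the sphere
        objects.append({
            "label": label,
            "source": "scene_graph",
            "direction": [round(c / distance, 3) for c in offset],
            "distance": round(distance, 2),
        })
    return json.dumps({"objects": objects}, indent=2)


def build_prompt(context_json, player_utterance):
    """Combine the environment JSON with the player's line (illustrative wording)."""
    return (
        "You are an NPC. Nearby objects, as unit direction vectors "
        "(+x forward, +y left, +z up):\n"
        f"{context_json}\n"
        f"Player says: {player_utterance}\n"
        "Reply in character and refer to the surroundings where relevant."
    )


if __name__ == "__main__":
    context = build_context(
        segments=[("door", 512, 256), ("lantern", 256, 128)],
        width=1024,
        height=512,
        scene_objects=[("well", (3.0, 4.0, 0.0)), ("tower", (30.0, 0.0, 0.0))],
    )
    print(build_prompt(context, "Where can I find water?"))
```

In this sketch the segmentation entries carry only a direction, since a single panorama gives no depth, while the scene graph entries also carry a distance. Whether the authors' representation makes the same distinction is not stated in the abstract.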