arxiv_cs_cv February 10, 2026

The Geometry of Representational Failures in Vision Language Models

Original Content

arXiv:2602.07025v1

Announce Type: new

Abstract: Vision-Language Models (VLMs) exhibit puzzling failures in multi-object visual tasks, such as hallucinating non-existent elements or failing to identify the most similar objects among distractors. While these errors mirror human cognitive constraints, such as the "Binding Problem", the internal mechanisms driving them in artificial systems remain poorly understood. Here, we offer mechanistic insight by analyzing the representational geometry of open-weight VLMs (Qwen, InternVL, Gemma) and comparing methodologies for distilling "concept vectors", latent directions that encode visual concepts. We validate our concept vectors via steering interventions that reliably manipulate model behavior in both simplified and naturalistic vision tasks (e.g., forcing the model to perceive a red flower as blue). We observe that the geometric overlap between these vectors strongly correlates with specific error patterns, offering a grounded quantitative framework for understanding how internal representations shape model behavior and drive visual failures.
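The three ideas in the abstract (extracting a concept vector, steering with it, and measuring geometric overlap) can be illustrated with a toy sketch. The abstract does not say which extraction methods the authors compare, so this sketch assumes a simple difference-of-means direction, uses cosine similarity as the overlap measure, and runs on synthetic 16-dimensional "activations" rather than real VLM hidden states. All names (`concept_vector`, `steer`, `readout`, the "orange" concept) are hypothetical and chosen only for illustration.

```python
import math
import random

DIM = 16
rng = random.Random(0)  # fixed seed so the toy data is reproducible


def unit(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity, used here as the 'geometric overlap' measure."""
    return dot(unit(u), unit(v))


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def concept_vector(with_concept, without_concept):
    """Difference-of-means direction (an ASSUMED extraction method)."""
    mu_pos = mean_vector(with_concept)
    mu_neg = mean_vector(without_concept)
    return [p - q for p, q in zip(mu_pos, mu_neg)]


def steer(hidden, direction, alpha):
    """Add alpha times the unit-normalised direction to a hidden state."""
    return [h + alpha * d for h, d in zip(hidden, unit(direction))]


def sample(true_direction, n=200, noise=0.1):
    """Synthetic activations: a fixed direction plus Gaussian noise."""
    return [[t + rng.gauss(0.0, noise) for t in true_direction] for _ in range(n)]


def basis(i):
    return [1.0 if j == i else 0.0 for j in range(DIM)]


# Hypothetical ground truth: "orange" shares most of its direction with "red",
# while "blue" is orthogonal to "red".
true_red = basis(0)
true_blue = basis(1)
true_orange = [0.8 * r + 0.6 * o for r, o in zip(basis(0), basis(2))]

baseline = sample([0.0] * DIM)  # activations without any of the concepts
red_vec = concept_vector(sample(true_red), baseline)
blue_vec = concept_vector(sample(true_blue), baseline)
orange_vec = concept_vector(sample(true_orange), baseline)

# Geometric overlap between pairs of concept vectors.
overlap_red_orange = cosine(red_vec, orange_vec)
overlap_red_blue = cosine(red_vec, blue_vec)


def readout(h):
    """Project a hidden state onto the red and blue concept directions."""
    return {"red": dot(h, unit(red_vec)), "blue": dot(h, unit(blue_vec))}


# Steering: push a "red" activation along the (blue - red) direction.
h_red = sample(true_red, n=1)[0]
red_to_blue = [b - r for b, r in zip(blue_vec, red_vec)]
h_steered = steer(h_red, red_to_blue, alpha=2.0)

print("overlap red/orange:", round(overlap_red_orange, 3))
print("overlap red/blue:  ", round(overlap_red_blue, 3))
print("before steering:", readout(h_red))
print("after steering: ", readout(h_steered))
```

In this toy setup the red and orange vectors overlap far more than red and blue, and the steered activation projects more strongly onto the blue direction than the red one. In the abstract's framing, pairs with high overlap would be the ones expected to correlate with specific error patterns. In a real model the intervention would be applied to hidden states at a chosen layer, and its effect judged from the model's output rather than from a linear readout.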