arxiv_cs_cv April 20, 2026

HyperGVL: Benchmarking and Improving Large Vision-Language Models in Hypergraph Understanding and Reasoning

Translated: 2026/4/20 10:47:53

hypergraph, vision-language-models, benchmarking, neural-architecture-search, reasoning

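Translation (from the Japanese)

arXiv:2604.15648v1 Announce Type: cross Abstract: Large vision-language models (LVLMs) constantly need new domains to guide the expansion of their capabilities, yet their abilities on hypergraphs remain unexplored. In the real world, hypergraphs have substantial practical applications in fields such as the life sciences and social communities. Although recent progress in LVLMs has shown promise for understanding complex topologies, no benchmark characterizes what LVLMs can do with hypergraphs, so the limits of those capabilities are unclear. To fill this gap, this paper introduces $\texttt{HyperGVL}$, the first benchmark to evaluate the proficiency of LVLMs in hypergraph understanding and reasoning. $\texttt{HyperGVL}$ provides a comprehensive evaluation of 12 advanced LVLMs on 84,000 vision-language question-answering (QA) samples spanning 12 tasks, from basic component counting to reasoning about complex NP-hard problems. The hypergraphs covered include multiscale synthetic structures as well as real-world citation and protein networks. In addition, we examine the effects of 12 textual and visual hypergraph representations and introduce $\texttt{WiseHyGR}$, a generalizable router that improves LVLMs on hypergraphs by learning adaptive representations. We believe this work is a step forward in connecting hypergraphs with LVLMs.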

Original Content

arXiv:2604.15648v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) consistently require new arenas to guide their expanding boundaries, yet their capabilities with hypergraphs remain unexplored. In the real world, hypergraphs have significant practical applications in areas such as life sciences and social communities. Recent advancements in LVLMs have shown promise in understanding complex topologies, yet there remains a lack of a benchmark to delineate the capabilities of LVLMs with hypergraphs, leaving the boundaries of their abilities unclear. To fill this gap, in this paper, we introduce $\texttt{HyperGVL}$, the first benchmark to evaluate the proficiency of LVLMs in hypergraph understanding and reasoning. $\texttt{HyperGVL}$ provides a comprehensive assessment of 12 advanced LVLMs across 84,000 vision-language question-answering (QA) samples spanning 12 tasks, ranging from basic component counting to complex NP-hard problem reasoning. The involved hypergraphs contain multiscale synthetic structures and real-world citation and protein networks. Moreover, we examine the effects of 12 textual and visual hypergraph representations and introduce a generalizable router $\texttt{WiseHyGR}$ that improves LVLMs in hypergraph via learning adaptive representations. We believe that this work is a step forward in connecting hypergraphs with LVLMs.
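
For readers unfamiliar with the objects being benchmarked, the following is a minimal sketch of a hypergraph and of the kind of "basic component counting" question the abstract mentions. The abstract does not describe HyperGVL's task definitions or its 12 representations, so the `Hypergraph` class, its method names, the toy co-authorship data, and the incidence-list text format are all illustrative assumptions rather than the benchmark's own.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """Hypothetical container: a hyperedge may join any number of vertices."""

    # Maps a hyperedge label to the set of vertices it contains.
    hyperedges: dict = field(default_factory=dict)

    def add_hyperedge(self, label, vertices):
        self.hyperedges[label] = frozenset(vertices)

    def vertices(self):
        # Union of all hyperedges; isolated vertices are not modeled here.
        result = set()
        for members in self.hyperedges.values():
            result |= members
        return result

    def degree(self, vertex):
        # Number of hyperedges that contain the given vertex.
        return sum(vertex in members for members in self.hyperedges.values())

    def to_incidence_text(self):
        # One assumed textual representation: "label: v1, v2, ..." per line.
        return "\n".join(
            f"{label}: {', '.join(sorted(members))}"
            for label, members in sorted(self.hyperedges.items())
        )


# Toy data (invented for illustration): each paper is a hyperedge over its authors.
hg = Hypergraph()
hg.add_hyperedge("paper1", ["alice", "bob", "carol"])
hg.add_hyperedge("paper2", ["bob", "dave"])
hg.add_hyperedge("paper3", ["carol"])

# Ground-truth answers to simple counting questions about the structure.
num_vertices = len(hg.vertices())
num_hyperedges = len(hg.hyperedges)
bob_degree = hg.degree("bob")

# Text that could accompany a counting question posed to a model.
text_prompt = hg.to_incidence_text()
```

Unlike an ordinary graph edge, a hyperedge such as `paper1` connects three vertices at once, which is why a plain edge list cannot express it without losing information. A benchmark item in this spirit would pair a rendering of the structure (textual, as above, or an image) with a question whose answer is computed programmatically.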