arxiv_cs_ai April 24, 2026

A Multimodal Text- and Graph-Based Approach for Open-Domain Event Extraction from Documents

Translated: 2026/4/24 20:29:12

event-extraction, large-language-models, multimodal-learning, graph-based-neural-networks, document-understanding

Original Content

arXiv:2604.21885v1 Announce Type: cross

Abstract: Event extraction is essential for event understanding and analysis. It supports tasks such as document summarization and decision-making in emergency scenarios. However, existing event extraction approaches have limitations: (1) closed-domain algorithms are restricted to predefined event types and thus rarely generalize to unseen types, and (2) open-domain event extraction algorithms, which can handle unconstrained event types, have largely overlooked the potential of large language models (LLMs) despite their advanced abilities. Additionally, existing approaches do not explicitly model document-level contextual, structural, and semantic reasoning, which is crucial for effective event extraction but remains challenging for LLMs because of the lost-in-the-middle phenomenon and attention dilution. To address these limitations, we propose multimodal open-domain event extraction (MODEE), a novel approach to open-domain event extraction that combines graph-based learning with text-based representations from LLMs to model document-level reasoning. Empirical evaluations on large datasets demonstrate that MODEE outperforms state-of-the-art open-domain event extraction approaches and generalizes to closed-domain event extraction, where it also outperforms existing algorithms.
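
To give a rough sense of what combining a graph view with a text view of a document could look like, here is a minimal sketch. The abstract does not describe MODEE's architecture, so everything in it is an assumption made for illustration: the sentence-level nodes, the adjacency edges, the toy 2-dimensional vectors standing in for LLM text embeddings, the mean-neighbour aggregation, the concatenation step, and the function names `propagate` and `fuse`. It is not the authors' method.

```python
from typing import Dict, List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]


def propagate(text_emb: Dict[str, Vector],
              edges: Dict[str, List[str]]) -> Dict[str, Vector]:
    """One round of mean-neighbour aggregation over a document graph.

    Hypothetical stand-in for graph-based learning: each node's graph
    view is the average of its neighbours' text embeddings.
    """
    graph_emb = {}
    for node, vec in text_emb.items():
        neighbours = [text_emb[n] for n in edges.get(node, [])]
        # Isolated nodes simply keep a copy of their own text embedding.
        graph_emb[node] = mean(neighbours) if neighbours else list(vec)
    return graph_emb


def fuse(text_emb: Dict[str, Vector],
         graph_emb: Dict[str, Vector]) -> Dict[str, Vector]:
    """Concatenate each node's text view with its graph view."""
    return {node: text_emb[node] + graph_emb[node] for node in text_emb}


# Toy document with three sentences; the vectors are made-up placeholders
# for embeddings that an LLM encoder would normally produce.
text_emb = {"s1": [1.0, 0.0], "s2": [0.0, 1.0], "s3": [1.0, 1.0]}
# Assumed structure: consecutive sentences are linked.
edges = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}

graph_emb = propagate(text_emb, edges)
fused = fuse(text_emb, graph_emb)
print(fused["s2"])  # [0.0, 1.0, 1.0, 0.5]
```

In this toy setup the fused vector for `s2` carries both its own text embedding and a summary of its neighbours `s1` and `s3`. That is one simple way document-level structure can reach a downstream classifier without depending on the model attending across a long input.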