arxiv_cs_cv April 24, 2026

AITP: Traffic Accident Responsibility Allocation via Multimodal Large Language Models

Translated: 2026/4/24 19:47:46

multimodal-large-language-models, traffic-accident-analysis, legal-knowledge-integration, reasoning-benchmarks, retrieval-augmented-generation

Original Content

arXiv:2604.20878v1 Announce Type: cross Abstract: Multimodal Large Language Models (MLLMs) have achieved remarkable progress in Traffic Accident Detection (TAD) and Traffic Accident Understanding (TAU). However, existing studies mainly focus on describing and interpreting accident videos, leaving room for deeper causal reasoning and integration of legal knowledge. Traffic Accident Responsibility Allocation (TARA) is a more challenging task that requires multi-step reasoning grounded in traffic regulations. To address this, we introduce AITP (Artificial Intelligence Traffic Police), a multimodal large language model for responsibility reasoning and allocation. AITP enhances reasoning via a Multimodal Chain-of-Thought (MCoT) mechanism and integrates legal knowledge through Retrieval-Augmented Generation (RAG). We further present DecaTARA, a decathlon-style benchmark unifying ten interrelated traffic accident reasoning tasks with 67,941 annotated videos and 195,821 question-answer pairs. Extensive experiments show that AITP achieves state-of-the-art performance across responsibility allocation, TAD, and TAU tasks, establishing a new paradigm for reasoning-driven multimodal traffic analysis.