arxiv_cs_ai April 20, 2026

VLegal-Bench: Cognitively Grounded Benchmark for Vietnamese Legal Reasoning of Large Language Models

legal-benchmarks, large-language-models, vietnamese-law, artificial-intelligence, cognitive-evaluation

Original Content

arXiv:2512.14554v5 Announce Type: replace-cross

Abstract: The rapid advancement of large language models (LLMs) has enabled new possibilities for applying artificial intelligence within the legal domain. Nonetheless, the complexity, hierarchical organization, and frequent revisions of Vietnamese legislation pose considerable challenges for evaluating how well these models interpret and utilize legal knowledge. To address this gap, we introduce the Vietnamese Legal Benchmark (VLegal-Bench), the first comprehensive benchmark designed to systematically assess LLMs on Vietnamese legal tasks. Informed by Bloom's cognitive taxonomy, VLegal-Bench encompasses multiple levels of legal understanding through tasks designed to reflect practical usage scenarios. The benchmark comprises 10,450 samples generated through a rigorous annotation pipeline in which legal experts label and cross-validate each instance using our annotation system. This process ensures that every sample is grounded in authoritative legal documents and mirrors real-world legal assistant workflows, including general legal questions and answers, retrieval-augmented generation, multi-step reasoning, and scenario-based problem solving tailored to Vietnamese law. By providing a standardized, transparent, and cognitively informed evaluation framework, VLegal-Bench establishes a solid foundation for assessing LLM performance in Vietnamese legal contexts and supports the development of more reliable, interpretable, and ethically aligned AI-assisted legal systems. To facilitate access and reproducibility, we provide a public landing page for this benchmark at https://vilegalbench.cmcai.vn/.
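A benchmark organized by cognitive level is typically reported as one score per level rather than a single aggregate. The sketch below illustrates that idea only; it is not the authors' evaluation code. The record fields (`id`, `level`, `answer`), the level labels, and the exact-match scoring rule are all assumptions made for illustration, and the abstract does not specify the actual data schema or metrics.

```python
from collections import defaultdict


def accuracy_by_level(samples, predictions):
    """Return {level: exact-match accuracy} for samples tagged with a cognitive level.

    `samples` is a list of dicts with hypothetical keys "id", "level", "answer".
    `predictions` maps sample id -> predicted answer; missing ids count as wrong.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        level = sample["level"]
        total[level] += 1
        if predictions.get(sample["id"]) == sample["answer"]:
            correct[level] += 1
    return {level: correct[level] / total[level] for level in total}


# Toy records invented for this example (not taken from VLegal-Bench).
samples = [
    {"id": "q1", "level": "remember", "answer": "A"},
    {"id": "q2", "level": "remember", "answer": "C"},
    {"id": "q3", "level": "apply", "answer": "B"},
    {"id": "q4", "level": "analyze", "answer": "D"},
]
predictions = {"q1": "A", "q2": "B", "q3": "B"}  # q4 has no prediction

scores = accuracy_by_level(samples, predictions)
for level, score in scores.items():
    print(f"{level}: {score:.2f}")
```

Reporting per level makes it visible when a model recalls statutes well but degrades on application or analysis tasks, which a single pooled accuracy would hide.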