arxiv_cs_ai April 24, 2026


Bounding the Black Box: A Statistical Certification Framework for AI Risk Regulation

Translated: 2026/4/24 20:18:49

artificial-intelligence, risk-regulation, statistical-certification, aviation-safety, machine-learning


Original Content

arXiv:2604.21854v1 Announce Type: new

Abstract: Artificial intelligence now decides who receives a loan, who is flagged for criminal investigation, and whether an autonomous vehicle brakes in time. Governments have responded: the EU AI Act, the NIST Risk Management Framework, and the Council of Europe Convention all demand that high-risk systems demonstrate safety before deployment. Yet beneath this regulatory consensus lies a critical vacuum: none specifies what "acceptable risk" means in quantitative terms, and none provides a technical method for verifying that a deployed system actually meets such a threshold. The regulatory architecture is in place; the verification instrument is not. This gap is not theoretical. As the EU AI Act moves into full enforcement, developers face mandatory conformity assessments without established methodologies for producing quantitative safety evidence, and the systems most in need of oversight are opaque statistical inference engines that resist white-box scrutiny. This paper provides the missing instrument. Drawing on the aviation certification paradigm, we propose a two-stage framework that transforms AI risk regulation into engineering practice. In Stage One, a competent authority formally fixes an acceptable failure probability $\delta$ and an operational input domain $\varepsilon$, a normative act with direct civil liability implications. In Stage Two, the RoMA and gRoMA statistical verification tools compute a definitive, auditable upper bound on the system's true failure rate, requiring no access to model internals and scaling to arbitrary architectures. We demonstrate how this certificate satisfies existing regulatory obligations, shifts accountability upstream to developers, and integrates with the legal frameworks that exist today.
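The two-stage idea can be illustrated with a minimal Python sketch. This is not the RoMA or gRoMA procedure, which the abstract does not describe; it swaps in a generic Hoeffding bound as a stand-in for any black-box statistical upper bound. The names `hoeffding_upper_bound`, `certify`, `system_fails`, and `sample_input` are hypothetical, and the sketch assumes inputs can be drawn independently from the operational domain fixed in Stage One.

```python
import math
import random


def hoeffding_upper_bound(failures, n, alpha):
    """One-sided (1 - alpha) upper confidence bound on the true failure rate.

    Uses Hoeffding's inequality for n i.i.d. pass/fail trials. This is a
    generic stand-in, NOT the RoMA/gRoMA computation.
    """
    if n <= 0 or not 0 < alpha < 1 or not 0 <= failures <= n:
        raise ValueError("need n > 0, 0 < alpha < 1, 0 <= failures <= n")
    slack = math.sqrt(math.log(1 / alpha) / (2 * n))
    return min(1.0, failures / n + slack)


def certify(system_fails, sample_input, delta, n, alpha=0.01, seed=0):
    """Black-box check: only queries the system, never inspects internals.

    system_fails(x) -> bool : hypothetical oracle, True if the system fails on x
    sample_input(rng) -> x  : hypothetical sampler over the operational domain
    delta                   : acceptable failure probability (Stage One)
    Returns (upper_bound, passed), where passed means upper_bound <= delta.
    """
    rng = random.Random(seed)
    failures = sum(1 for _ in range(n) if system_fails(sample_input(rng)))
    bound = hoeffding_upper_bound(failures, n, alpha)
    return bound, bound <= delta


if __name__ == "__main__":
    # Toy system: fails on inputs above 0.999, drawn uniformly from [0, 1).
    bound, passed = certify(
        system_fails=lambda x: x > 0.999,
        sample_input=lambda rng: rng.random(),
        delta=0.01,
        n=200_000,
    )
    print(f"upper bound on failure rate: {bound:.5f}, certified: {passed}")
```

The bound holds with probability at least 1 - alpha over the random sample, so a certificate of this kind is a statistical statement rather than a proof. Its validity also depends on the sampler faithfully representing the operational domain, which is why fixing that domain is a separate, normative step.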