arxiv_cs_lg 2026/2/10

Quantifying Explanation Quality in Graph Neural Networks using Out-of-Distribution Generalization

Translated: 2026/3/15 14:47:10

graph-neural-networks, explainability, out-of-distribution, causality, generalization

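Translation

arXiv:2602.07708v1 Announce Type: new Abstract: Evaluating the quality of post-hoc explanations for Graph Neural Networks (GNNs) remains a major challenge. Although explainability methods have developed rapidly in recent years, current evaluation metrics (e.g., fidelity, sparsity) often fail to assess whether an explanation identifies the true causal variables. To address this problem, we propose the Explanation-Generalization Score (EGS), a metric that quantifies the causal relevance of GNN explanations. It builds on the principle of feature invariance and assumes that if an explanation captures the true causal drivers, predictions should remain stable across distribution shifts. To quantify this, we introduce a framework that trains GNNs on explanatory subgraphs and evaluates their performance in Out-of-Distribution (OOD) settings (here, OOD generalization serves as a rigorous proxy for the causal validity of the explanation). Through large-scale validation involving 11,200 model combinations across synthetic and real-world datasets, our results show that EGS provides a principled benchmark for ranking explainers by their ability to capture causal substructures, and that it is a robust alternative to traditional fidelity-based metrics.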

Original Content

arXiv:2602.07708v1 Announce Type: new Abstract: Evaluating the quality of post-hoc explanations for Graph Neural Networks (GNNs) remains a significant challenge. While recent years have seen an increasing development of explainability methods, current evaluation metrics (e.g., fidelity, sparsity) often fail to assess whether an explanation identifies the true underlying causal variables. To address this, we propose the Explanation-Generalization Score (EGS), a metric that quantifies the causal relevance of GNN explanations. EGS is founded on the principle of feature invariance and posits that if an explanation captures true causal drivers, it should lead to stable predictions across distribution shifts. To quantify this, we introduce a framework that trains GNNs using explanatory subgraphs and evaluates their performance in Out-of-Distribution (OOD) settings (here, OOD generalization serves as a rigorous proxy for the explanation's causal validity). Through large-scale validation involving 11,200 model combinations across synthetic and real-world datasets, our results demonstrate that EGS provides a principled benchmark for ranking explainers based on their ability to capture causal substructures, offering a robust alternative to traditional fidelity-based metrics.
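
The idea described in the abstract (train a model only on explanatory subgraphs, then check how it performs under a distribution shift) can be illustrated with a toy sketch. This is not the paper's EGS definition or implementation: the abstract gives no formula, so everything below is an assumption made for illustration. Graphs are reduced to lists of node types, a simple vote-counting classifier stands in for a GNN, the explainers are hand-written filters, and the name `ood_score` is hypothetical. The sketch also assumes the explainer is applied at test time as well as at training time.

```python
from collections import Counter

# Hypothetical toy data: each "graph" is a list of node types plus a label.
# The motif nodes ("cycle" / "house") determine the label. The base nodes
# ("tree" / "ladder") only correlate with it in the training distribution.
def make_graphs(flip_spurious):
    graphs = []
    for label, motif in enumerate(["cycle", "house"]):
        spurious_label = 1 - label if flip_spurious else label
        base = ["tree", "ladder"][spurious_label]
        graphs.append(([motif] * 3 + [base] * 5, label))
    return graphs

# Two hand-written stand-ins for explainers: each returns the nodes it keeps.
def causal_explainer(nodes):
    return [n for n in nodes if n in ("cycle", "house")]

def spurious_explainer(nodes):
    return [n for n in nodes if n in ("tree", "ladder")]

# Stand-in for a GNN: count which label each node type co-occurs with.
def train(subgraphs):
    votes = {}
    for nodes, label in subgraphs:
        for n in nodes:
            votes.setdefault(n, Counter())[label] += 1
    return votes

def predict(votes, nodes):
    total = Counter()
    for n in nodes:
        total.update(votes.get(n, {}))
    # Fall back to class 0 when no known node type is present.
    return total.most_common(1)[0][0] if total else 0

# Assumed scoring rule: OOD accuracy of a model trained on explanatory subgraphs.
def ood_score(explainer, train_graphs, ood_graphs):
    model = train([(explainer(g), y) for g, y in train_graphs])
    hits = sum(predict(model, explainer(g)) == y for g, y in ood_graphs)
    return hits / len(ood_graphs)

train_graphs = make_graphs(flip_spurious=False)  # in-distribution
ood_graphs = make_graphs(flip_spurious=True)     # spurious correlation reversed

print(ood_score(causal_explainer, train_graphs, ood_graphs))    # 1.0
print(ood_score(spurious_explainer, train_graphs, ood_graphs))  # 0.0
```

In this toy setting the explainer that keeps the label-determining motif yields a model that still classifies correctly after the shift, while the explainer that keeps the spuriously correlated base nodes yields one that fails. That ordering is the kind of ranking the abstract attributes to EGS.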