arxiv_cs_ai April 24, 2026

TRAVELFRAUDBENCH: A Configurable Evaluation Framework for GNN Fraud Ring Detection in Travel Networks

Translated: 2026/4/24 20:23:29

Tags: travel-fraud, graph-neural-networks, fraud-detection, benchmark-evaluation, heterogeneous-graphs

Original Content

arXiv:2604.21093v1 Announce Type: cross

Abstract: We introduce TravelFraudBench (TFG), a configurable benchmark for evaluating graph neural networks (GNNs) on fraud ring detection in travel platform graphs. Existing benchmarks (YelpChi, Amazon-Fraud, Elliptic, PaySim) cover single node types or domain-generic patterns and offer no mechanism to evaluate across structurally distinct fraud ring topologies. TFG simulates three travel-specific ring types in a heterogeneous graph with 9 node types and 12 edge types: ticketing fraud (star topology with shared device/IP clusters), ghost hotel schemes (reviewer × hotel bipartite cliques), and account takeover rings (loyalty transfer chains). Ring size, count, fraud rate, scale (500 to 200,000 nodes), and composition are fully configurable. We evaluate six methods (MLP, GraphSAGE, RGCN-proj, HAN, RGCN, and PC-GNN) under a ring-based split in which each ring appears entirely in one partition, eliminating transductive label leakage. GraphSAGE achieves AUC=0.992 and RGCN-proj AUC=0.987, outperforming the MLP baseline (AUC=0.938) by 5.5 and 5.0 pp respectively, confirming that graph structure adds substantial discriminative power. HAN (AUC=0.935) is a negative result, matching the MLP baseline. On the ring recovery task (≥80% of ring members flagged simultaneously), GraphSAGE achieves 100% recovery across all ring types; MLP recovers only 17-88%. The edge-type ablation shows that device and IP co-occurrence are the primary signals: removing uses_device drops AUC by 5.2 pp. TFG is released as an open-source Python package (MIT license) with PyG, DGL, and NetworkX exporters and pre-generated datasets at https://huggingface.co/datasets/bsajja7/travel-fraud-graphs, with Croissant metadata including Responsible AI fields.
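
The ring-based split and the ring recovery criterion mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the TFG package's API. The function names `ring_based_split` and `ring_recovery_rate`, the `ring_of` mapping (node id to ring id, `None` for nodes outside any ring), the 60/20/20 fractions, and the random assignment of non-ring nodes are all assumptions made for illustration. The recovery rule assumes a ring counts as recovered when at least 80% of its members are flagged.

```python
import random
from collections import defaultdict


def ring_based_split(ring_of, fractions=(0.6, 0.2, 0.2), seed=0):
    """Assign each ring, as a whole, to exactly one of train/val/test.

    ring_of: dict mapping node id -> ring id, or None for non-ring nodes.
    Returns a dict mapping node id -> partition name.
    (Hypothetical helper; not part of the TFG package.)
    """
    rng = random.Random(seed)
    names = ("train", "val", "test")

    # Shuffle the ring ids, then cut the shuffled list into three blocks.
    ring_ids = sorted({r for r in ring_of.values() if r is not None})
    rng.shuffle(ring_ids)
    cut1 = round(fractions[0] * len(ring_ids))
    cut2 = cut1 + round(fractions[1] * len(ring_ids))
    ring_part = {}
    for i, ring in enumerate(ring_ids):
        ring_part[ring] = names[0] if i < cut1 else names[1] if i < cut2 else names[2]

    split = {}
    for node, ring in ring_of.items():
        if ring is not None:
            # Every member inherits its ring's partition, so no ring
            # straddles two partitions.
            split[node] = ring_part[ring]
        else:
            # Assumption: non-ring nodes are split independently at random.
            split[node] = rng.choices(names, weights=fractions)[0]
    return split


def ring_recovery_rate(ring_of, flagged, threshold=0.8):
    """Fraction of rings with at least `threshold` of their members flagged.

    flagged: set of node ids predicted as fraudulent.
    (Hypothetical helper; the 0.8 default follows the abstract's criterion.)
    """
    members = defaultdict(list)
    for node, ring in ring_of.items():
        if ring is not None:
            members[ring].append(node)
    if not members:
        return 0.0
    recovered = sum(
        1
        for nodes in members.values()
        if sum(n in flagged for n in nodes) / len(nodes) >= threshold
    )
    return recovered / len(members)
```

Splitting by ring rather than by node means a model never sees labeled members of a test ring during training, which is the leakage the abstract says the ring-based split removes.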