arxiv_cs_cv February 10, 2026

CoBEVMoE: Heterogeneity-aware Feature Fusion with Dynamic Mixture-of-Experts for Collaborative Perception

Translated: 2026/3/15 13:02:32

Tags: cobevmoe, mixture-of-experts, heterogeneous-fusion, bev-perception, v2x-systems

Original Content

arXiv:2509.17107v2 Announce Type: replace

Abstract: Collaborative perception aims to extend sensing coverage and improve perception accuracy by sharing information among multiple agents. However, due to differences in viewpoints and spatial positions, agents often acquire heterogeneous observations. Existing intermediate fusion methods primarily focus on aligning similar features, often overlooking the perceptual diversity among agents. To address this limitation, we propose CoBEVMoE, a novel collaborative perception framework that operates in the Bird's Eye View (BEV) space and incorporates a Dynamic Mixture-of-Experts (DMoE) architecture. In DMoE, each expert is dynamically generated based on the input features of a specific agent, enabling it to extract distinctive and reliable cues while attending to shared semantics. This design allows the fusion process to explicitly model both feature similarity and heterogeneity across agents. Furthermore, we introduce a Dynamic Expert Metric Loss (DEML) to enhance inter-expert diversity and improve the discriminability of the fused representation. Extensive experiments on the OPV2V and DAIR-V2X-C datasets demonstrate that CoBEVMoE achieves state-of-the-art performance. Specifically, it improves the IoU for camera-based BEV segmentation by +1.5% on OPV2V and the AP@0.5 for LiDAR-based 3D object detection by +3.0% on DAIR-V2X-C, verifying the effectiveness of expert-based heterogeneous feature modeling in multi-agent collaborative perception. The source code will be made publicly available at https://github.com/godk0509/CoBEVMoE.
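
The DMoE idea in the abstract (one expert generated per agent from that agent's own features, followed by a fusion step) can be sketched roughly as below. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation. The function names `dynamic_moe_fuse` and `expert_diversity_penalty`, the weights `w_gen` and `w_gate`, the choice of 1x1-convolution experts, the per-cell softmax gate, and the cosine-similarity penalty are all hypothetical. In particular, the penalty is only a stand-in for a diversity-encouraging term; the abstract does not state the actual form of DEML.

```python
import numpy as np


def dynamic_moe_fuse(feats, w_gen, w_gate):
    """Fuse per-agent BEV features with input-conditioned experts (illustrative).

    feats:  (N, C, H, W) BEV feature maps from N agents, assumed to be already
            warped into the ego agent's frame.
    w_gen:  (C, C*C) hypothetical generator weights that map a pooled
            descriptor to a CxC kernel (a 1x1 convolution).
    w_gate: (C,) hypothetical gating weights giving one logit per agent per cell.
    """
    n, c, h, w = feats.shape
    # Pool each agent's map into a global descriptor: (N, C).
    desc = feats.mean(axis=(2, 3))
    # Generate one expert per agent from that agent's own descriptor.
    # The identity term keeps each expert close to a pass-through mapping.
    kernels = np.eye(c) + np.tanh(desc @ w_gen).reshape(n, c, c)
    # Apply each agent's expert to that agent's features: (N, C, H, W).
    expert_out = np.einsum("noc,nchw->nohw", kernels, feats)
    # Per-cell gate: numerically stable softmax over the agent axis.
    logits = np.einsum("c,nchw->nhw", w_gate, expert_out)
    logits = logits - logits.max(axis=0, keepdims=True)
    gate = np.exp(logits)
    gate = gate / gate.sum(axis=0, keepdims=True)
    # Gate-weighted sum of expert outputs: (C, H, W).
    fused = (gate[:, None] * expert_out).sum(axis=0)
    return fused, gate, kernels


def expert_diversity_penalty(kernels):
    """Mean pairwise cosine similarity between experts (lower = more diverse).

    An assumed stand-in for a diversity term; NOT the paper's DEML.
    """
    n = kernels.shape[0]
    if n < 2:
        return 0.0  # a single expert has no pairs to compare
    flat = kernels.reshape(n, -1)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sim = flat @ flat.T
    off_diag = sim[~np.eye(n, dtype=bool)]
    return float(off_diag.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(3, 4, 5, 6))   # 3 agents, 4 channels, 5x6 BEV grid
    w_gen = rng.normal(size=(4, 16))
    w_gate = rng.normal(size=4)
    fused, gate, kernels = dynamic_moe_fuse(feats, w_gen, w_gate)
    print(fused.shape, gate.shape, expert_diversity_penalty(kernels))
```

In this sketch the experts differ across agents only because each agent's pooled descriptor differs, which is one simple way to make the fusion depend on agent-specific observations. A trained system would learn `w_gen` and `w_gate` jointly with the perception backbone.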