arxiv_cs_cv April 20, 2026

MMGait: Towards Multi-Modal Gait Recognition

gait-recognition, multi-modal, biometrics, computer-vision, depth-camera

Original Content

arXiv:2604.15979v1

Abstract: Gait recognition has emerged as a powerful biometric technique for identifying individuals at a distance without requiring user cooperation. Most existing methods focus primarily on RGB-derived modalities, which fall short in real-world scenarios requiring multi-modal collaboration and cross-modal retrieval. To overcome these challenges, we present MMGait, a comprehensive multi-modal gait benchmark integrating data from five heterogeneous sensors, including an RGB camera, a depth camera, an infrared camera, a LiDAR scanner, and a 4D Radar system. MMGait contains twelve modalities and 334,060 sequences from 725 subjects, enabling systematic exploration across geometric, photometric, and motion domains. Based on MMGait, we conduct extensive evaluations on single-modal, cross-modal, and multi-modal paradigms to analyze modality robustness and complementarity. Furthermore, we introduce a new task, Omni Multi-Modal Gait Recognition, which aims to unify the above three gait recognition paradigms within a single model. We also propose a simple yet powerful baseline, OmniGait, which learns a shared embedding space across diverse modalities and achieves promising recognition performance. The MMGait benchmark, codebase, and pretrained checkpoints are publicly available at https://github.com/BNU-IVC/MMGait.
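The abstract does not describe how OmniGait is built or how retrieval is scored, so the following is only a minimal sketch of one common way to evaluate cross-modal retrieval in a shared embedding space. It assumes cosine similarity and rank-1 accuracy. The function names, the modality pairing (LiDAR probes against an RGB-derived gallery), and the toy embeddings are all hypothetical and are not taken from the MMGait codebase.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank1_accuracy(probes, gallery):
    """Fraction of probes whose most similar gallery entry shares their subject ID.

    probes, gallery: lists of (subject_id, embedding) pairs, where all
    embeddings are assumed to live in the same shared space.
    """
    hits = 0
    for probe_id, probe_emb in probes:
        best_id, _ = max(
            gallery, key=lambda item: cosine_similarity(probe_emb, item[1])
        )
        hits += int(best_id == probe_id)
    return hits / len(probes)


# Hypothetical toy embeddings: gallery from one modality, probes from another.
gallery_rgb = [
    ("s1", [1.0, 0.1, 0.0]),
    ("s2", [0.0, 1.0, 0.1]),
    ("s3", [0.1, 0.0, 1.0]),
]
probes_lidar = [
    ("s1", [0.9, 0.2, 0.1]),
    ("s2", [0.1, 0.8, 0.0]),
    ("s3", [0.7, 0.1, 0.6]),  # deliberately ambiguous between s1 and s3
]

accuracy = rank1_accuracy(probes_lidar, gallery_rgb)
print(f"cross-modal rank-1 accuracy: {accuracy:.3f}")
```

In this toy setup the third probe lies closer to the gallery entry for s1 than to its own subject, so two of the three probes are matched correctly. The same scoring function covers the single-modal case when the probes and the gallery come from the same sensor.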