arxiv_cs_cv February 10, 2026

Cross-Camera Cow Identification via Disentangled Representation Learning

Translated: 2026/3/15 18:04:32

calf-identification, machine-learning, cross-camera, smart-farming, representation-learning

Original Content

arXiv:2602.07566v1 | Announce Type: new

Abstract: Precise identification of individual cows is a fundamental prerequisite for comprehensive digital management in smart livestock farming. While existing animal identification methods excel in controlled, single-camera settings, they face severe challenges regarding cross-camera generalization. When models trained on source cameras are deployed to new monitoring nodes characterized by divergent illumination, backgrounds, viewpoints, and heterogeneous imaging properties, recognition performance often degrades dramatically. This limits the large-scale application of non-contact technologies in dynamic, real-world farming environments. To address this challenge, this study proposes a cross-camera cow identification framework based on disentangled representation learning. This framework leverages the Subspace Identifiability Guarantee (SIG) theory in the context of bovine visual recognition. By modeling the underlying physical data generation process, we designed a principle-driven feature disentanglement module that decomposes observed images into multiple orthogonal latent subspaces. This mechanism effectively isolates stable, identity-related biometric features that remain invariant across cameras, thereby substantially improving generalization to unseen cameras. We constructed a high-quality dataset spanning five distinct camera nodes, covering heterogeneous acquisition devices and complex variations in lighting and angles. Extensive experiments across seven cross-camera tasks demonstrate that the proposed method achieves an average accuracy of 86.0%, significantly outperforming the Source-only Baseline (51.9%) and the strongest cross-camera baseline method (79.8%). This work establishes a subspace-theoretic feature disentanglement framework for collaborative cross-camera cow identification, offering a new paradigm for precise animal monitoring in uncontrolled smart farming environments.
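The abstract does not specify how the disentanglement module enforces orthogonality between latent subspaces. As a rough illustration only, the sketch below uses a generic cross-covariance penalty, a common stand-in technique and not the authors' SIG-based method. It splits each latent vector into an assumed "identity" part and an assumed "camera" part and penalizes correlation between them. The names `split_latent`, `cross_covariance`, `orthogonality_penalty`, and `id_dim` are hypothetical.

```python
# Illustrative sketch only: a generic decorrelation penalty, NOT the paper's module.
# All function names and the id_dim parameter are hypothetical.


def split_latent(z, id_dim):
    """Split one latent vector into an identity part and a camera (nuisance) part."""
    return z[:id_dim], z[id_dim:]


def cross_covariance(id_feats, cam_feats):
    """Cross-covariance matrix between two batches of feature vectors."""
    n = len(id_feats)
    id_mean = [sum(col) / n for col in zip(*id_feats)]
    cam_mean = [sum(col) / n for col in zip(*cam_feats)]
    return [
        [
            sum(
                (a[i] - id_mean[i]) * (b[j] - cam_mean[j])
                for a, b in zip(id_feats, cam_feats)
            ) / n
            for j in range(len(cam_mean))
        ]
        for i in range(len(id_mean))
    ]


def orthogonality_penalty(latents, id_dim):
    """Squared Frobenius norm of the cross-covariance between the two parts.

    The value is zero when the identity and camera parts are uncorrelated
    over the batch, and grows as the two parts share more variation.
    """
    parts = [split_latent(z, id_dim) for z in latents]
    id_feats = [p[0] for p in parts]
    cam_feats = [p[1] for p in parts]
    cov = cross_covariance(id_feats, cam_feats)
    return sum(v * v for row in cov for v in row)


# Toy batches of 2-D latents: first coordinate = identity, second = camera.
decorrelated = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
entangled = [[1.0, 1.0], [-1.0, -1.0]]

print(orthogonality_penalty(decorrelated, id_dim=1))  # 0.0
print(orthogonality_penalty(entangled, id_dim=1))     # 1.0
```

In a training loop, a term like this would typically be added to the identification loss so that the identity subspace carries as little camera-specific variation as possible. Whether the paper uses a comparable objective cannot be determined from the abstract.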