arxiv_cs_cv February 10, 2026

Open-Set Domain Adaptation Under Background Distribution Shift: Challenges and A Provably Efficient Solution

Translated: 2026/3/15 17:03:38

open-set-recognition, domain-adaptation, machine-learning, distribution-shift, co-regularization

Japanese Translation

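arXiv:2512.01152v3 Announce Type: replace-cross Abstract: When deploying machine learning systems in the real world, a core challenge is maintaining a model whose performance holds up even as the data shifts. Such shifts take various forms. The problem in which new classes appear that were absent at training time is called open-set recognition, and the distribution of the known categories may also change. Guarantees for open-set recognition are, in most cases, derived under the assumption that the distribution of the known categories (which we call the background distribution) is fixed. In this paper, we develop CoLOR, a method that is guaranteed to solve open-set recognition even when the background distribution shifts. We prove that the method works under the benign assumption that the novel class is separable from the non-novel classes, and show that it is guaranteed to outperform a representative baseline in a simplified overparameterized setting. We develop techniques to make CoLOR scalable and robust, and conduct comprehensive empirical evaluations on image and text data. The results show that CoLOR significantly outperforms existing open-set recognition methods under background shift. We further provide new insights into factors that affect performance, such as the size of the novel class, which have not been sufficiently explored in prior work.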

Original Content

arXiv:2512.01152v3 Announce Type: replace-cross Abstract: As we deploy machine learning systems in the real world, a core challenge is to maintain a model that is performant even as the data shifts. Such shifts can take many forms: new classes may emerge that were absent during training, a problem known as open-set recognition, and the distribution of known categories may change. Guarantees on open-set recognition are mostly derived under the assumption that the distribution of known classes, which we call the background distribution, is fixed. In this paper we develop CoLOR, a method that is guaranteed to solve open-set recognition even in the challenging case where the background distribution shifts. We prove that the method works under benign assumptions that the novel class is separable from the non-novel classes, and provide theoretical guarantees that it outperforms a representative baseline in a simplified overparameterized setting. We develop techniques to make CoLOR scalable and robust, and perform comprehensive empirical evaluations on image and text data. The results show that CoLOR significantly outperforms existing open-set recognition methods under background shift. Moreover, we provide new insights into how factors such as the size of the novel class influence performance, an aspect that has not been extensively explored in prior work.
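
The setting described in the abstract can be written down as a mixture model. The sketch below is one common formalization, not the paper's own notation: the symbols $P_S$, $P_T$, $Q_{\text{known}}$, $Q_{\text{novel}}$, $\alpha$, and $h^*$ are all assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the paper):
%   P_S        : training (source) distribution over inputs from known classes
%   P_T        : deployment (target) distribution over inputs
%   Q_known    : target-time distribution of inputs from known classes (the background)
%   Q_novel    : distribution of inputs from the novel class
%   \alpha     : proportion of the novel class in the target data
\[
  P_T \;=\; (1-\alpha)\, Q_{\text{known}} \;+\; \alpha\, Q_{\text{novel}},
  \qquad \alpha \in (0,1).
\]
% Fixed background (the usual assumption):   Q_known  =  P_S
% Background shift (the case studied here):  Q_known \neq P_S
%
% Separability of the novel class: there is a detector h^* with
\[
  h^*(x) = 1 \ \text{for } Q_{\text{novel}}\text{-almost every } x,
  \qquad
  h^*(x) = 0 \ \text{for almost every } x \text{ under } P_S \text{ and } Q_{\text{known}}.
\]
```

Under this reading, the difficulty is that a target sample can differ from the training data for two reasons: it belongs to the novel class, or it comes from a known class whose distribution has moved. The novel-class size that the abstract mentions would correspond to $\alpha$ in this notation.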