arxiv_cs_lg February 10, 2026

Federated Learning with Profile Mapping under Distribution Shifts and Drifts

Translated: 2026/3/15 14:10:37

federated-learning, machine-learning, data-heterogeneity, privacy-preserving, model-aggregation

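Translation (from Japanese)

arXiv:2602.07671v1 Announce Type: new Federated learning (FL) enables distributed model training across clients without sharing raw data, but its performance degrades under real-world data heterogeneity. Existing methods either fail to handle distribution shift across clients and distribution drift over time, or rest on unrealistic assumptions, such as a known number of client clusters or known types of data heterogeneity, which limits their generality. We introduce Feroma, a new FL framework that explicitly handles distribution shift and drift without depending on client or cluster identity. Feroma is built on client distribution profiles, which are compact, privacy-preserving representations of local data, and uses adaptive similarity-based weighting for model aggregation and test-time model assignment. With this design, Feroma dynamically selects an aggregation strategy during training, anywhere from clustered to personalized, and can deploy a suitable model to unseen, unlabeled test clients without retraining, online adaptation, or prior knowledge of the clients' data. In extensive experiments against 10 state-of-the-art methods, Feroma improved performance and stability under dynamic data heterogeneity, raising average accuracy by up to 12 percentage points over the best baselines across 6 benchmarks, while keeping computation and communication overhead comparable to FedAvg. These results suggest that distribution-profile-based aggregation opens a practical path to robust FL under both data distribution shifts and drifts.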

Original Content

arXiv:2602.07671v1 Announce Type: new Abstract: Federated Learning (FL) enables decentralized model training across clients without sharing raw data, but its performance degrades under real-world data heterogeneity. Existing methods often fail to address distribution shift across clients and distribution drift over time, or they rely on unrealistic assumptions, such as a known number of client clusters and known data heterogeneity types, which limits their generalizability. We introduce Feroma, a novel FL framework that explicitly handles both distribution shift and drift without relying on client or cluster identity. Feroma builds on client distribution profiles (compact, privacy-preserving representations of local data) that guide model aggregation and test-time model assignment through adaptive similarity-based weighting. This design allows Feroma to dynamically select aggregation strategies during training, ranging from clustered to personalized, and to deploy suitable models to unseen and unlabeled test clients without retraining, online adaptation, or prior knowledge of clients' data. Extensive experiments show that, compared to 10 state-of-the-art methods, Feroma improves performance and stability under dynamic data heterogeneity conditions, with an average accuracy gain of up to 12 percentage points over the best baselines across 6 benchmarks, while maintaining computational and communication overhead comparable to FedAvg. These results highlight that distribution-profile-based aggregation offers a practical path toward robust FL under both data distribution shifts and drifts.
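
The abstract does not say how profiles are built or how similarity is computed, so the following is only an illustrative sketch of the general idea of profile-based, similarity-weighted aggregation, not Feroma's actual algorithm. Everything in it is an assumption made for illustration: the per-feature mean as a profile, cosine similarity, a softmax with a `temperature` parameter, and all function names. In this toy version a small temperature yields near-personalized or clustered averaging, while a very large one approaches a uniform average.

```python
import math

def feature_mean_profile(samples):
    """Hypothetical profile: per-feature mean of a client's local (unlabeled) samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def cosine(u, v):
    """Cosine similarity between two profile vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity_weights(profile, all_profiles, temperature=0.1):
    """Softmax over scaled cosine similarities; the weights sum to 1."""
    scaled = [cosine(profile, p) / temperature for p in all_profiles]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_for(client, profiles, models, temperature=0.1):
    """Similarity-weighted average of all clients' parameter vectors for one client."""
    w = similarity_weights(profiles[client], profiles, temperature)
    dim = len(models[0])
    return [sum(w[j] * models[j][k] for j in range(len(models))) for k in range(dim)]

def assign_model(test_profile, profiles):
    """Index of the training client whose profile is most similar to an unseen client's."""
    return max(range(len(profiles)), key=lambda j: cosine(test_profile, profiles[j]))

# Toy data: clients 0 and 1 have similar feature distributions, client 2 is shifted.
client_data = [
    [[1.0, 0.1], [0.9, 0.0]],
    [[1.1, 0.0], [1.0, 0.2]],
    [[0.0, 1.0], [0.1, 0.9]],
]
profiles = [feature_mean_profile(d) for d in client_data]
models = [[1.0, 1.0], [1.2, 0.8], [-1.0, 3.0]]  # toy local parameter vectors

personalized = [aggregate_for(i, profiles, models) for i in range(len(models))]

# An unseen, unlabeled test client only needs a profile to be matched to a model.
unseen_profile = feature_mean_profile([[0.05, 1.1], [0.0, 0.95]])
assigned = assign_model(unseen_profile, profiles)
```

With the toy data, clients 0 and 1 end up sharing an almost identical aggregated model, client 2 keeps a model close to its own, and the unseen client is matched to client 2 without any retraining. A real system would need a profile that is actually privacy-preserving, which a raw feature mean is not guaranteed to be.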