arxiv_cs_lg April 24, 2026

Differentially Private Clustered Federated Learning with Privacy-Preserving Initialization and Normality-Driven Aggregation

federated-learning, differential-privacy, machine-learning, distributed-systems, data-privacy

Original Content

arXiv:2604.20596v1 Announce Type: new Abstract: Federated learning (FL) enables training of a global model while keeping raw data on end devices. Despite this, FL has been shown to leak private user information, so in practice it is often coupled with methods such as differential privacy (DP) and secure vector sum to provide formal privacy guarantees to its participants. In realistic cross-device deployments, the data are highly heterogeneous, so vanilla federated learning converges slowly and generalizes poorly. Clustered federated learning (CFL) mitigates this by segregating users into clusters, leading to lower intra-cluster data heterogeneity. Nevertheless, coupling CFL with DP remains challenging: the injected DP noise makes individual client updates excessively noisy, and the server is unable to initialize cluster centroids with the less noisy aggregated updates. To address this challenge, we propose PINA, a two-stage framework. In the first stage, each client fine-tunes a lightweight low-rank adaptation (LoRA) adapter and privately shares a compressed sketch of the update, and the server leverages these sketches to construct robust cluster centroids. In the second stage, PINA introduces a normality-driven aggregation mechanism that improves convergence and robustness. Our method retains the benefits of clustered FL while providing formal privacy guarantees against an untrusted server. Extensive evaluations show that our proposed method outperforms state-of-the-art DP-FL algorithms by an average of 2.9% in accuracy for privacy budgets ε ∈ {2, 8}.
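
The first stage described in the abstract (clients privately share a compressed sketch, the server builds cluster centroids from the sketches) can be illustrated with a toy example. The abstract does not say how PINA constructs its sketches or centroids, so this is not the authors' algorithm. As stand-ins it uses a random Gaussian projection, L2 clipping with Gaussian noise, and Lloyd's k-means with greedy farthest-point seeding. All function names, parameters, and constants (`private_sketch`, `farthest_point_init`, `kmeans`, `clip_norm`, `noise_std`, the synthetic client data) are hypothetical, and `noise_std` is not calibrated to any particular privacy budget.

```python
import numpy as np

def private_sketch(update, proj, clip_norm, noise_std, rng):
    """Compress an update, clip its L2 norm, and add Gaussian noise (client side)."""
    sketch = proj @ update  # random projection: d -> k dimensions
    scale = min(1.0, clip_norm / max(np.linalg.norm(sketch), 1e-12))
    sketch = sketch * scale  # norm is now at most clip_norm
    return sketch + rng.normal(0.0, noise_std, size=sketch.shape)

def farthest_point_init(points, n_clusters):
    """Greedy seeding: start at point 0, then repeatedly add the farthest point."""
    chosen = [0]
    for _ in range(n_clusters - 1):
        diffs = points[:, None, :] - points[chosen][None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        chosen.append(int(nearest.argmax()))
    return points[chosen].copy()

def kmeans(points, n_clusters, n_iter=20):
    """Lloyd's k-means on the noisy sketches (server side)."""
    centroids = farthest_point_init(points, n_clusters)
    for _ in range(n_iter):
        diffs = points[:, None, :] - centroids[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centroids[c] = points[labels == c].mean(axis=0)
    return centroids, labels

# Toy setup (all values are illustrative assumptions, not from the paper).
rng = np.random.default_rng(0)
dim, sketch_dim, clients_per_group = 1000, 16, 20
proj = rng.normal(size=(sketch_dim, dim)) / np.sqrt(sketch_dim)  # shared by all clients

direction = rng.normal(size=dim)
direction *= 5.0 / np.linalg.norm(direction)
group_means = [direction, -direction]  # two synthetic client populations

# The first 20 rows come from group 0, the next 20 from group 1.
sketches = np.stack([
    private_sketch(mean + rng.normal(0.0, 0.01, size=dim), proj,
                   clip_norm=1.0, noise_std=0.05, rng=rng)
    for mean in group_means
    for _ in range(clients_per_group)
])

centroids, labels = kmeans(sketches, n_clusters=2)
print(labels)
```

In this sketch the server only ever sees the noised, clipped projections. Clustering them is post-processing of already-privatized outputs, so it adds no privacy cost beyond what releasing the noisy sketches incurs. The two synthetic groups are deliberately well separated; with heavier noise or closer groups, the greedy seeding and plain k-means used here would be much less reliable.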