Hidden Data Privacy Breaches in Federated Learning

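Summary

Federated Learning (FL) has emerged as a paradigm for running machine learning across broad, decentralized datasets, promising stronger privacy by removing the need to share data directly.
However, recent studies show that attackers can still steal private data through model manipulation or gradient analysis.
Existing attacks are limited to small amounts of stolen data or to low-resolution data, and they are often detected by anomaly monitoring of gradients or weights.
In this paper, we propose a novel data-reconstruction attack that leverages malicious code injection and is supported by two key techniques: a distinctive and sparse encoding design, and block partitioning.
Unlike conventional methods, which require detectable changes to the model, our method uses parameter sharing to stealthily embed a hidden model that systematically extracts sensitive data.
The Fibonacci-based index design ensures efficient, structured retrieval of memorized data, while block partitioning lets the method handle high-resolution images by dividing them into smaller, manageable units.
Extensive experiments on four datasets confirm that our method outperforms five state-of-the-art data-reconstruction attacks under five respective detection methods.
Our method can handle large-scale, high-resolution data without being detected or mitigated by state-of-the-art data-reconstruction defenses.
In contrast to the baselines, our method applies directly to both FedAVG and FedSGD scenarios, underscoring the need for developers to devise new defenses against such vulnerabilities.
We will open-source our code upon acceptance.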
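
The summary describes block partitioning only at a high level: a high-resolution image is divided into smaller, manageable units. The sketch below illustrates that general idea and nothing more. It is not the authors' implementation; the function names `partition_blocks` and `merge_blocks`, the row-major tile order, and the requirement that the image dimensions be multiples of the block size are all assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def partition_blocks(image: Image, block: int) -> List[Image]:
    """Split an image into non-overlapping block x block tiles, row-major.

    Assumption for this sketch: height and width are multiples of `block`.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    tiles = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Take `block` rows, and from each row the `block` columns of this tile.
            tiles.append([row[left:left + block] for row in image[top:top + block]])
    return tiles


def merge_blocks(tiles: List[Image], height: int, width: int) -> Image:
    """Inverse of partition_blocks: reassemble row-major tiles into one image."""
    block = len(tiles[0])
    tiles_per_row = width // block
    image = [[0] * width for _ in range(height)]
    for index, tile in enumerate(tiles):
        top = (index // tiles_per_row) * block
        left = (index % tiles_per_row) * block
        for dy, row in enumerate(tile):
            image[top + dy][left:left + block] = row
    return image
```

Under these assumptions, a 4x4 image split with `block=2` yields four 2x2 tiles, and `merge_blocks(tiles, 4, 4)` returns the original image, so each tile can be handled independently and the full image recovered afterwards.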

Abstract (original)

Federated Learning (FL) emerged as a paradigm for conducting machine learning across broad and decentralized datasets, promising enhanced privacy by obviating the need for direct data sharing. However, recent studies show that attackers can steal private data through model manipulation or gradient analysis. Existing attacks are constrained by low theft quantity or low-resolution data, and they are often detected through anomaly monitoring in gradients or weights. In this paper, we propose a novel data-reconstruction attack leveraging malicious code injection, supported by two key techniques, i.e., distinctive and sparse encoding design and block partitioning. Unlike conventional methods that require detectable changes to the model, our method stealthily embeds a hidden model using parameter sharing to systematically extract sensitive data. The Fibonacci-based index design ensures efficient, structured retrieval of memorized data, while the block partitioning method enhances our method’s capability to handle high-resolution images by dividing them into smaller, manageable units. Extensive experiments on 4 datasets confirmed that our method is superior to the five state-of-the-art data-reconstruction attacks under the five respective detection methods. Our method can handle large-scale and high-resolution data without being detected or mitigated by state-of-the-art data reconstruction defense methods. In contrast to baselines, our method can be directly applied to both FedAVG and FedSGD scenarios, underscoring the need for developers to devise new defenses against such vulnerabilities. We will open-source our code upon acceptance.
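
The abstract notes that the attack applies to both FedAVG and FedSGD. For readers unfamiliar with the distinction, the sketch below shows the server-side aggregation step of each scheme as it is commonly defined: in FedSGD clients send gradients and the server takes one step on their weighted average, while in FedAVG clients train locally and the server takes a weighted average of the resulting weights. This is a generic illustration, not code from the paper; the function names and the use of plain Python lists as parameter vectors are assumptions.

```python
from typing import List, Sequence

Vector = List[float]


def weighted_mean(vectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Element-wise mean of vectors, weighted (here) by client dataset size."""
    total = sum(weights)
    return [
        sum(w * v[i] for v, w in zip(vectors, weights)) / total
        for i in range(len(vectors[0]))
    ]


def fedsgd_round(global_w: Vector, client_grads: Sequence[Vector],
                 client_sizes: Sequence[float], lr: float) -> Vector:
    """FedSGD: average the client gradients, then take one server-side step."""
    avg_grad = weighted_mean(client_grads, client_sizes)
    return [w - lr * g for w, g in zip(global_w, avg_grad)]


def fedavg_round(client_weights: Sequence[Vector],
                 client_sizes: Sequence[float]) -> Vector:
    """FedAVG: clients have already trained locally; average their weights."""
    return weighted_mean(client_weights, client_sizes)
```

The practical difference for an attacker or defender is what crosses the wire: per-step gradients in FedSGD versus locally updated weights in FedAVG.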

arXiv information

Authors	Xueluan Gong, Yuji Wang, Shuaike Li, Mengyuan Sun, Songze Li, Qian Wang, Kwok-Yan Lam, Chen Chen
Published	2024-11-27 12:04:37+00:00
arXiv page	arxiv_id(pdf)

Sources and services used

arxiv.jp, Google
